Deep Face Recognition with Relational Databases and SQL

A face recognition task requires storing multidimensional arrays and performing calculations over them. This does not naturally match the capabilities of relational databases and SQL. However, an RDBMS comes with elegant, neat and well organized structures. In this post, we will mention how to use a relational database in a face recognition pipeline.

Arya in The Hall of Faces, Game Of Thrones
Face recognition pipeline

A modern face recognition pipeline consists of 4 common stages: detect, align, represent and verify.



Luckily, deepface for python covers all of those stages. It wraps several state-of-the-art face recognition models: VGG-Face, Google FaceNet, OpenFace, Facebook DeepFace, DeepID, Dlib and ArcFace. Those models have already surpassed human-level accuracy. In this post, we will use the FaceNet model to represent facial images as vectors. The model expects (160, 160) shaped inputs and produces 128-dimensional representations.

#!pip install deepface
from deepface import DeepFace

FaceNet, VGG-Face and ArcFace outperform the others. Here, you can watch how to determine the best model.

Local facial data set

We will use the unit test items of deepface as a local data set. Let's find the image paths first.

import os

facial_img_paths = []
#Available at: https://github.com/serengil/deepface/tree/master/tests/dataset
for root, directory, files in os.walk('../deepface/tests/dataset'):
    for file in files:
        if '.jpg' in file:
            facial_img_paths.append(root+'/'+file)

Once we have the exact image paths, deepface can handle the pre-processing stages of the pipeline: detection and alignment.

import numpy as np
import pandas as pd
from tqdm import tqdm

instances = []
for i in tqdm(range(0, len(facial_img_paths))):
    facial_img_path = facial_img_paths[i]
    #represent returns a python list; cast to float32 so it can be stored as a blob later
    embedding = np.array(
        DeepFace.represent(img_path = facial_img_path, model_name = "Facenet")[0]["embedding"]
    , dtype = 'float32')
    
    #store
    instance = []
    instance.append(facial_img_path)
    instance.append(embedding)
    instances.append(instance)

df = pd.DataFrame(instances, columns = ['img_name', 'embedding'])

Deepface wraps several face detectors: opencv, ssd, mtcnn and dlib. MTCNN is the most robust one but it is also the slowest. SSD is the fastest one but its alignment score is not as high as mtcnn's.

You can monitor the face detection performance of those backends in the following video.

You can find out more about the math behind alignment in the following video:

Besides, face detectors detect faces in a rectangular area, so detected faces come with some noise such as background. We can find 68 different landmarks of a face with dlib. In this way, we can get rid of that noise in a facial image.





Herein, retinaface is the cutting-edge technology for face detection. It can even detect faces in a crowd. Besides, it finds some facial landmarks including eye coordinates. In this way, its alignment score is high as well.

SQLite

We are going to use SQLite as the database in this study. However, it could be adapted to Oracle, MySQL, MS SQL or DB2 as well. My choice is mainly based on its lightweight nature. Here, you can download the precompiled binaries. My tests were done with version 3.34.1. You just need to unzip the downloaded file and add its location to your PATH variable.

Creating database

Once you have added the location of sqlite to your path, you can call the sqlite3 command in a command prompt. Go to your working directory, call the sqlite3 executable and pass the database name as an argument. This will activate the sqlite command line shell as well. Here, you can call the .databases command to confirm the database was created.

sqlite3 facialdb.db
.databases

We can call any database related command in the sqlite command line shell. However, we will not need it anymore. We will handle everything in its python interface.

Connecting database

Sqlite is an out-of-the-box dependency for python 3. You don’t have to install any external package to communicate with sqlite db.

import sqlite3

conn = sqlite3.connect('facialdb.db')
cursor = conn.cursor()

Creating database tables

We will represent a facial image as a 128-dimensional vector and store this embedding instead of the image itself. We can either convert this 128-dimensional vector to binary and store it in a blob field, or store each dimension value in a decimal field.

Both approaches have pros and cons. Storing the vector as a blob requires making all calculations on the client side. If you have limited database power, this option might be better. On the other hand, storing each dimension in a row requires making all calculations in the database. This might be better if you have a thin client such as a Raspberry Pi or a strong database such as Oracle Exadata. We will implement both in this study.
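To make the blob option concrete, a vector can be serialized with numpy's tobytes and restored with frombuffer. The only pitfall is that the dtype must match on both sides. A minimal sketch:

```python
import numpy as np

#a toy 128-dimensional embedding, float32 as FaceNet-style models produce
vec = np.random.rand(128).astype('float32')

#serialize to raw bytes for a BLOB column
blob = vec.tobytes()

#restore on the client side; the dtype must match the one used when serializing
restored = np.frombuffer(blob, dtype = 'float32')

assert np.array_equal(vec, restored)
```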

cursor.execute('''create table face_meta (ID INT primary key, IMG_NAME VARCHAR(10), EMBEDDING BLOB)''')
cursor.execute('''create table face_embeddings (FACE_ID INT, DIMENSION INT, VALUE DECIMAL(10, 8))''')

I will have a main metadata table named face_meta. This stores the image name and the binary version of its representation.

I have also created a dedicated table to store each dimension value. Alternatively, I could store each dimension value in a column of the metadata table, but face recognition models create outputs with different dimension sizes. I mean that if I use VGG-Face instead of FaceNet, then the database design would have to change. With this design, however, I can use either VGG-Face or FaceNet.

Store local dataset in database

The pandas dataframe already stores the image name and its embedding as columns. We will walk over the rows of the data frame, store the name and its embedding as binary in the meta table, and then walk over the dimension values in the embedding and store each dimension value as a row.





for index, instance in tqdm(df.iterrows(), total=df.shape[0]):
    img_name = instance['img_name']
    #cast to float32 so that tobytes produces a fixed-size binary representation
    embeddings = np.array(instance['embedding'], dtype = 'float32')
    
    insert_statement = 'INSERT INTO face_meta (ID, IMG_NAME, EMBEDDING) VALUES (?, ?, ?)'
    insert_args = (index, img_name, embeddings.tobytes())
    cursor.execute(insert_statement, insert_args)
    
    for i, embedding in enumerate(embeddings):
        insert_statement = 'INSERT INTO face_embeddings (FACE_ID, DIMENSION, VALUE) VALUES (?, ?, ?)'
        insert_args = (index, i, float(embedding))
        cursor.execute(insert_statement, insert_args)

conn.commit()

Both the metadata and representation tables are filled now. You can check this if you run a select query in the sqlite command line shell.

Checking tables with the sqlite command line shell
Target image

We are going to look for the identity of a new target image in the database. Notice that this image does not appear in the dataset. We have to apply the same pre-processing and representation stages.

target_img_path = '../target.png'
#represent handles the detection and alignment stages internally
target_embedding = DeepFace.represent(img_path = target_img_path, model_name = "Facenet")[0]["embedding"]
Angelina Jolie as a target
Server side solution

This approach is my favorite because everything is handled in the sql query. If your database server is powerful (e.g. Oracle Exadata), this approach will suit you well.

Notice that FaceNet represents facial images as 128-dimensional vectors. So, there are 128 rows for each facial image in the face_embeddings table. Here, we have to express the target image representation as 128 rows as well. The following code snippet handles this: it walks over the dimension values and marks each with its index.

target_statement = ''
for i, value in enumerate(target_embedding):
    target_statement += 'select %d as dimension, %s as value' % (i, str(value)) #sqlite
    #target_statement += 'select %d as dimension, %s as value from dual' % (i, str(value)) #oracle
    
    if i < len(target_embedding) - 1:
        target_statement += ' union all '
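To see what this produces, the sketch below builds the same union-all chain for a toy 3-dimensional vector and runs it against an in-memory sqlite database. The chain behaves like an inline table with one row per dimension. The variable names here are hypothetical, chosen to avoid clashing with the pipeline above.

```python
import sqlite3

#toy 3-dimensional vector instead of FaceNet's 128 dimensions
toy_embedding = [0.1, 0.2, 0.3]

toy_statement = ''
for i, value in enumerate(toy_embedding):
    toy_statement += 'select %d as dimension, %s as value' % (i, str(value))
    if i < len(toy_embedding) - 1:
        toy_statement += ' union all '

#the union-all chain acts as an inline table: one row per dimension
demo_conn = sqlite3.connect(':memory:')
rows = demo_conn.execute(toy_statement).fetchall()
print(rows) #[(0, 0.1), (1, 0.2), (2, 0.3)]
```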

Then, I treated it as an inline table named target. In this way, I can apply a left join based on the dimension index.

select_statement = f'''
    select * 
    from (
        select img_name, sum(subtract_dims) as distance_squared
        from (
            select img_name, (source - target) * (source - target) as subtract_dims
            from (
                select meta.img_name, emb.value as source, target.value as target
                from face_meta meta left join face_embeddings emb
                on meta.id = emb.face_id
                left join (
                    {target_statement}  
                ) target
                on emb.dimension = target.dimension
            )
        )
        group by img_name
    )
    where distance_squared < 100
    order by distance_squared asc
'''
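The nested query can be exercised end to end on a tiny in-memory database. The sketch below uses toy 2-dimensional embeddings and two fake image names instead of the real data set, but the table layout and the join logic are the same:

```python
import sqlite3

demo_conn = sqlite3.connect(':memory:')
demo_cursor = demo_conn.cursor()
demo_cursor.execute('create table face_meta (ID INT primary key, IMG_NAME VARCHAR(10), EMBEDDING BLOB)')
demo_cursor.execute('create table face_embeddings (FACE_ID INT, DIMENSION INT, VALUE DECIMAL(10, 8))')

#two toy faces with 2-dimensional embeddings
faces = {1: ('img1.jpg', [0.0, 0.0]), 2: ('img2.jpg', [3.0, 4.0])}
for face_id, (img_name, embedding) in faces.items():
    demo_cursor.execute('insert into face_meta (ID, IMG_NAME) values (?, ?)', (face_id, img_name))
    for dim, value in enumerate(embedding):
        demo_cursor.execute('insert into face_embeddings values (?, ?, ?)', (face_id, dim, value))

#target vector [0, 0] expressed as inline rows, as in the post
toy_target = 'select 0 as dimension, 0.0 as value union all select 1 as dimension, 0.0 as value'

demo_query = f'''
    select img_name, sum(subtract_dims) as distance_squared
    from (
        select meta.img_name, (emb.value - target.value) * (emb.value - target.value) as subtract_dims
        from face_meta meta left join face_embeddings emb
        on meta.id = emb.face_id
        left join ({toy_target}) target
        on emb.dimension = target.dimension
    )
    group by img_name
    order by distance_squared asc
'''
demo_results = demo_cursor.execute(demo_query).fetchall()
print(demo_results) #[('img1.jpg', 0.0), ('img2.jpg', 25.0)]
```

Here img2.jpg has embedding (3, 4), so its squared distance to the origin target is 9 + 16 = 25, matching the query output.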

Euclidean distance calculation requires a square root but there is no built-in sqrt function in sqlite. My intent was to discard the facial images that have a distance greater than 10, which is the threshold for the FaceNet and Euclidean distance pair. Since I cannot take the square root in the sql select statement, I instead check that the sum of the squared differences of the dimensions is less than 100.
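Alternatively, the sqlite python driver lets you register python's math.sqrt as a user-defined sql function with create_function, so the query could compare the real distance against 10 directly. A small sketch on a throwaway in-memory connection:

```python
import math
import sqlite3

mem_conn = sqlite3.connect(':memory:')
#expose python's math.sqrt to sql as a scalar function named sqrt
mem_conn.create_function('sqrt', 1, math.sqrt)

row = mem_conn.execute('select sqrt(100)').fetchone()
print(row[0]) #10.0
```

The same call could be applied to the facialdb.db connection; note that such a function only exists while the query runs through python, not in the sqlite command line shell.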

import math

results = cursor.execute(select_statement)
instances = []
for result in results:
    img_name = result[0]
    distance_squared = result[1]
    
    instance = []
    instance.append(img_name)
    instance.append(math.sqrt(distance_squared))
    instances.append(instance)

result_df = pd.DataFrame(instances, columns = ['img_name', 'distance'])
Client side solution

Remember that the facial representations were stored as blobs as well. We will retrieve the whole table first, and then find the euclidean distance between each item in the table and the target on the client side.

import numpy as np

select_statement = 'select img_name, embedding from face_meta'
results = cursor.execute(select_statement)

instances = []
for result in results:
    img_name = result[0]
    embedding_bytes = result[1]
    embedding = np.frombuffer(embedding_bytes, dtype = 'float32')
    
    instance = []
    instance.append(img_name)
    instance.append(embedding)
    instances.append(instance)

result_df = pd.DataFrame(instances, columns = ['img_name', 'embedding'])

The dataframe stores the image name and its representation as columns. Let's add the target representation as a column as well.

target_duplicated = np.array([target_embedding,]*result_df.shape[0])
result_df['target'] = target_duplicated.tolist()

Then, we will walk over the rows of the dataframe. The source and target representations are columns, so we can find the distance value for each row. Once distance values are found, we can discard the rows with higher distances.

def findEuclideanDistance(row):
    source = np.array(row['embedding'])
    target = np.array(row['target'])
    distance = (source - target)
    return np.sqrt(np.sum(np.multiply(distance, distance)))

result_df['distance'] = result_df.apply(findEuclideanDistance, axis = 1)
result_df = result_df[result_df['distance'] <= 10]
result_df = result_df.sort_values(by = ['distance']).reset_index(drop = True)
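As a side note, the row-wise apply can be vectorized: stacking the stored embeddings into a matrix lets numpy compute all distances against the target in a single np.linalg.norm call. A sketch with toy 2-dimensional vectors and hypothetical names:

```python
import numpy as np
import pandas as pd

#toy 2-dimensional embeddings instead of FaceNet's 128 dimensions
toy_df = pd.DataFrame({
    'img_name': ['img1.jpg', 'img2.jpg'],
    'embedding': [[0.0, 0.0], [3.0, 4.0]],
})
toy_target = [0.0, 0.0]

#stack the row embeddings into an (n, d) matrix and broadcast the subtraction
source = np.stack(toy_df['embedding'].values)
toy_df['distance'] = np.linalg.norm(source - np.array(toy_target), axis = 1)
print(toy_df['distance'].tolist()) #[0.0, 5.0]
```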

They are all Angelina Jolie. It seems both the client and server side solutions are working.





Results
Databaseless solution

You might not always need a database. Herein, deepface offers an out-of-the-box find function to handle this task.

#find returns a list of dataframes, one per face detected in the target image
dfs = DeepFace.find(img_path = target_img_path, db_path = 'deepface/tests/dataset'
, model_name = 'Facenet', distance_metric = 'euclidean'
, detector_backend = 'opencv')
print(dfs[0].head())

It can verify face pairs with a single line of code.

Face recognition requires applying face verification several times in the background. Here, find can also look for the identity of a facial image in a data set.

Large scale face recognition

Notice that face recognition has O(n) time complexity and it becomes very problematic for really large data. If you have millions of facial images, this experiment will not help you unless you run it on an Oracle Exadata. You might instead run it on big data systems such as Hadoop, Cassandra, Redis or MongoDB, which come with the power of map reduce technology.

Besides, approximate nearest neighbor algorithms reduce the time complexity of this problem dramatically. Spotify Annoy, Facebook Faiss, NMSLIB or Elasticsearch might be better if you have billions of facial images.

Tech Stack Recommendations

Face recognition is mainly based on representing facial images as vectors. Herein, storing the vector representations is a key factor for building robust facial recognition systems. I summarize the tech stack recommendations in the following video.

Conclusion

So, we have mentioned how to use a relational database in a face recognition pipeline. It makes your pipeline elegant and well organized. This approach is well suited to mid-sized data and thin clients.

I pushed the source code of this study to GitHub. You can support this study if you star⭐️ the repo🙏.



