A face recognition task requires storing multidimensional arrays and making calculations over them. This workload does not naturally match the capabilities of relational databases and SQL. However, an RDBMS comes with elegant, neat and well organized structures. In this post, we will mention how to use a relational database in a face recognition pipeline.
Vlog
You can either continue to read this tutorial or watch the following video. They both cover the exact nearest neighbor algorithm with the SQL-based relational database SQLite. Even though we used SQLite in our experiments, any relational database such as Oracle, DB2 or MS SQL can be adopted.
🙋♂️ You may consider enrolling in my top-rated machine learning course on Udemy
Face recognition pipeline
A modern face recognition pipeline consists of 4 common stages: detect, align, represent and verify.
Luckily, deepface for python covers all of those stages. It wraps several state-of-the-art face recognition models: VGG-Face, Google FaceNet, OpenFace, Facebook DeepFace, DeepID, Dlib and ArcFace. Those models have already surpassed human-level accuracy. In this post, we will use the FaceNet model to represent facial images as vectors. The model expects (160, 160)-shaped inputs and produces 128-dimensional representations.
from deepface import DeepFace
Local facial data set
We will use the unit test items of deepface. Let’s find the image names first.
import os

facial_img_paths = []
#Available at: https://github.com/serengil/deepface/tree/master/tests/dataset
for root, directory, files in os.walk('../deepface/tests/dataset'):
    for file in files:
        if '.jpg' in file:
            facial_img_paths.append(root+'/'+file)
Once we have the exact image names, deepface can handle the pre-processing stages of the pipeline: detection and alignment.
import pandas as pd
from tqdm import tqdm

instances = []
for i in tqdm(range(0, len(facial_img_paths))):
    facial_img_path = facial_img_paths[i]
    embedding = DeepFace.represent(img_path = facial_img_path, model_name = "Facenet")[0]["embedding"]

    #store
    instance = []
    instance.append(facial_img_path)
    instance.append(embedding)
    instances.append(instance)

df = pd.DataFrame(instances, columns = ['img_name', 'embedding'])
Deepface wraps several face detectors: OpenCV, SSD, MTCNN and Dlib. MTCNN is the most robust one, but it is also the slowest. SSD is the fastest one, but its alignment score is not as high as MTCNN's.
You can monitor the face detection performance of those backends in the following video.
You can find more on the math behind alignment in the following video:
Besides, face detectors detect faces in a rectangular area, so detected faces come with some noise such as background color. We can find 68 different landmarks of a face with dlib. In this way, we can get rid of the noise in a facial image.
Herein, RetinaFace is the cutting-edge technology for face detection. It can even detect faces in the crowd. Besides, it finds facial landmarks including eye coordinates, so its alignment score is high as well.
SQLite
We are going to use SQLite as the database in this study. However, it could be adapted to Oracle, MySQL, MS SQL or DB2 as well. My choice is mainly based on its light weight. Here, you can download the precompiled binaries. My tests were done with version 3.34.1. You just need to unzip the downloaded file and add its location to your PATH variable.
Creating database
Once you have added the location of sqlite to PATH, you can call the sqlite3 command in a command prompt. Then, go to your working directory, call the sqlite3 executable and pass the database name as an argument. This will activate the sqlite command line shell as well. Here, you should call the .databases command.
sqlite3 facialdb.db
.databases
We can call any database related command in the sqlite command line shell. However, we will not need it anymore; we will handle everything in its Python interface.
Connecting database
SQLite support is an out-of-the-box dependency of Python 3. You don't have to install any external package to communicate with a SQLite database.
import sqlite3

conn = sqlite3.connect('facialdb.db')
cursor = conn.cursor()
Creating database tables
We will represent a facial image as a 128-dimensional vector and store this embedding instead of the image itself. We can either convert this 128-dimensional vector to binary and store it in a blob field, or store each dimension value in a decimal field.
Both approaches have pros and cons. Storing the vector as a blob requires making all calculations on the client side. If you have limited database power, this option might be better. On the other hand, storing each dimension in a row requires making all calculations in the database. This might be better if you have a thin client such as a Raspberry Pi, or a powerful database such as Oracle Exadata. We will implement both in this study.
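As a quick sketch of the blob option, a 128-dimensional embedding can be serialized to bytes and restored later; the dtype must match on both sides. The sample vector here is made up for illustration.

```python
import numpy as np

# hypothetical 128-dimensional embedding; FaceNet output is a plain Python list
embedding = [0.1] * 128

# blob option: serialize to float32 bytes before writing to the BLOB field
blob = np.array(embedding, dtype='float32').tobytes()
print(len(blob))  # 128 dimensions * 4 bytes per float32 = 512

# reading it back requires the same dtype on the client side
restored = np.frombuffer(blob, dtype='float32')
```

Note that the byte string carries no dtype or shape information of its own, which is why the reader and writer must agree on float32.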
cursor.execute('''create table face_meta (ID INT primary key, IMG_NAME VARCHAR(10), EMBEDDING BLOB)''')

cursor.execute('''create table face_embeddings (FACE_ID INT, DIMENSION INT, VALUE DECIMAL(5, 30))''')
I will have a main metadata table named face_meta. This stores the image name and binary version of its representation.
I have created a dedicated table to store each dimension value. Alternatively, I could store each dimension value in a column of the metadata table, but face recognition models create outputs with different dimension sizes. I mean that if I use VGG-Face instead of FaceNet, the database design would be different. With this design, however, I can use either VGG-Face or FaceNet.
Store local dataset in database
The pandas dataframe already stores the image name and its embedding as columns. We will walk over the rows of the data frame, store the name and its embedding as binary in the meta table, then walk over the dimension values in the embedding and store each dimension value as a row.
import numpy as np

for index, instance in tqdm(df.iterrows(), total=df.shape[0]):
    img_name = instance['img_name']
    #DeepFace.represent returns a plain list; cast to float32 so tobytes works
    embeddings = np.array(instance['embedding'], dtype='float32')

    insert_statement = 'INSERT INTO face_meta (ID, IMG_NAME, EMBEDDING) VALUES (?, ?, ?)'
    insert_args = (index, img_name, embeddings.tobytes())
    cursor.execute(insert_statement, insert_args)

    for i, embedding in enumerate(embeddings):
        insert_statement = 'INSERT INTO face_embeddings (FACE_ID, DIMENSION, VALUE) VALUES (?, ?, ?)'
        insert_args = (index, i, str(embedding))
        cursor.execute(insert_statement, insert_args)

conn.commit()
Both the metadata and the representation tables are now filled. You can check this by running a select query in the sqlite command line shell.
Target image
We are going to look for the identity of a new target image in the database. Notice that this image does not appear in the dataset. We have to apply the same pre-processing and representation stages.
target_img_path = '../target.png'
target_img = DeepFace.extract_faces(img_path = target_img_path)[0]["face"]
target_embedding = DeepFace.represent(img_path = target_img_path, model_name = "Facenet")[0]["embedding"]
Server side solution
This approach is my favorite because everything is handled in the SQL query. If your database server is powerful (e.g. Oracle Exadata), this approach will serve you well.
Notice that FaceNet represents facial images as 128-dimensional vectors, so there are 128 rows for each facial image in the face_embeddings table. Here, we have to express the target image representation as 128 rows as well. The following code snippet handles this: it walks over the dimension values and marks each with its index.
target_statement = ''
for i, value in enumerate(target_embedding):
    target_statement += 'select %d as dimension, %s as value' % (i, str(value)) #sqlite
    #target_statement += 'select %d as dimension, %s as value from dual' % (i, str(value)) #oracle

    if i < len(target_embedding) - 1:
        target_statement += ' union all '
Then, I declared it as a table named target. In this way, I can apply a left join based on the dimension index.
select_statement = f'''
    select *
    from (
        select img_name, sum(subtract_dims) as distance_squared
        from (
            select img_name, (source - target) * (source - target) as subtract_dims
            from (
                select meta.img_name, emb.value as source, target.value as target
                from face_meta meta
                left join face_embeddings emb on meta.id = emb.face_id
                left join (
                    {target_statement}
                ) target on emb.dimension = target.dimension
            )
        )
        group by img_name
    )
    where distance_squared < 100
    order by distance_squared asc
'''
Euclidean distance calculation requires a square root, but there is no built-in sqrt function in SQLite. My intent was to discard the facial images whose distance is greater than 10, which is the threshold for the FaceNet and Euclidean distance pair. Since I cannot take the square root in the SQL select statement, I instead check that the sum of the squared differences of the dimensions is less than 100.
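Alternatively, SQLite's Python driver lets you register a user defined function, so sqrt could be made available inside the query itself. This is a minimal sketch with an in-memory database; only the function registration is shown, not the full face query.

```python
import math
import sqlite3

# in-memory database just for this sketch
conn = sqlite3.connect(':memory:')

# expose math.sqrt to SQL as a user defined function named sqrt
conn.create_function('sqrt', 1, math.sqrt)

cursor = conn.cursor()
cursor.execute('select sqrt(100)')
print(cursor.fetchone()[0])  # 10.0
```

With this in place, the select statement could compute sqrt(sum(subtract_dims)) directly and filter on distance < 10 instead of distance_squared < 100.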
import math

results = cursor.execute(select_statement)

instances = []
for result in results:
    img_name = result[0]
    distance_squared = result[1]

    instance = []
    instance.append(img_name)
    instance.append(math.sqrt(distance_squared))
    instances.append(instance)

result_df = pd.DataFrame(instances, columns = ['img_name', 'distance'])
Client side solution
Remember that facial representations were stored as blobs as well. We will retrieve the whole table first, then find the Euclidean distance between each item in the table and the target on the client side.
select_statement = 'select img_name, embedding from face_meta'
results = cursor.execute(select_statement)

instances = []
for result in results:
    img_name = result[0]
    embedding_bytes = result[1]
    embedding = np.frombuffer(embedding_bytes, dtype = 'float32')

    instance = []
    instance.append(img_name)
    instance.append(embedding)
    instances.append(instance)

result_df = pd.DataFrame(instances, columns = ['img_name', 'embedding'])
The dataframe stores the image name and its representation as columns. Let's add the target representation as a column.
target_duplicated = np.array([target_embedding,] * result_df.shape[0])
result_df['target'] = target_duplicated.tolist()
Then, we will walk over the rows of the dataframe. The source and target representations are columns, so we can find the distance value for each row. Once distance values are found, we can discard the higher ones.
def findEuclideanDistance(row):
    source = np.array(row['embedding'])
    target = np.array(row['target'])
    distance = (source - target)
    return np.sqrt(np.sum(np.multiply(distance, distance)))

result_df['distance'] = result_df.apply(findEuclideanDistance, axis = 1)
result_df = result_df[result_df['distance'] <= 10]
result_df = result_df.sort_values(by = ['distance']).reset_index(drop = True)
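The row-wise apply works, but the same distances can also be computed for all rows at once with numpy broadcasting, which is noticeably faster on large tables. A sketch with made-up 3-dimensional vectors standing in for the 128-dimensional embeddings:

```python
import numpy as np
import pandas as pd

# made-up data: two stored embeddings and a matching target vector
result_df = pd.DataFrame({
    'img_name': ['img1.jpg', 'img2.jpg'],
    'embedding': [[0.0, 0.0, 0.0], [3.0, 4.0, 0.0]],
})
target_embedding = [0.0, 0.0, 0.0]

# stack the embedding column into a matrix and compute every distance at once
embeddings = np.stack(result_df['embedding'].to_list())
result_df['distance'] = np.linalg.norm(embeddings - np.array(target_embedding), axis=1)

result_df = result_df[result_df['distance'] <= 10]
result_df = result_df.sort_values(by=['distance']).reset_index(drop=True)
```

The broadcasting subtracts the target from every row of the matrix in one step, so no Python-level loop over rows is needed.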
They are all Angelina Jolie. It seems that both the client side and server side solutions are working.
Databaseless solution
You might not always need a database. Herein, deepface offers an out-of-the-box find function to handle this task.
df = DeepFace.find(img_path = target_img_path, db_path = 'deepface/tests/dataset',
    model_name = 'Facenet', distance_metric = 'euclidean', detector_backend = 'opencv')
print(df.head())
It can verify face pairs with a single line of code.
Face recognition requires applying face verification several times in the background. Here, it can also look for the identity of a facial image in a dataset.
The Best Single Model
DeepFace has many cutting-edge models in its portfolio. Find out the best configuration for facial recognition model, detector, similarity metric and alignment mode.
DeepFace API
DeepFace offers a web service for face verification, facial attribute analysis and vector embedding generation through its API. You can watch a tutorial on using the DeepFace API here:
Additionally, DeepFace can be run with Docker to access its API. Learn how in this video:
Large scale face recognition
Notice that face recognition has O(n) time complexity, and this becomes very problematic for really large data. If you have data at the scale of millions, this experiment will not help you unless you run it on something like Oracle Exadata. You might instead run it on big data systems such as Hadoop, Cassandra, Redis or MongoDB, which come with the power of map reduce technology.
Super Fast Vector Search
In this post, we focused on using the k-NN algorithm to find similar vectors. However, this approach becomes problematic with large databases due to its time complexity of O(n + n log(n)). Imagine indexing all images on Google! To address this, we use the approximate nearest neighbor algorithm, which significantly reduces complexity and allows for super-fast vector searches. With this method, you can find the nearest vectors in a billion-scale database in just milliseconds. Many vector databases and indexing tools, such as Annoy, Faiss, ElasticSearch, NMSLIB, and Redis, adopt a similar approach.
Tech Stack Recommendations
Face recognition is mainly based on representing facial images as vectors. Herein, storing the vector representations is a key factor for building robust facial recognition systems. I summarize the tech stack recommendations in the following video.
Conclusion
So, we have mentioned how to use a relational database in a face recognition pipeline. It makes your pipeline elegant and well organized. This approach is well suited to mid-sized data sets and thin clients.
I pushed the source code of this study to GitHub. You can support this study by starring⭐️ the repo🙏.
Support this blog if you like it!
This page seems to have an encoding error for part of the code and I’m having trouble trying to figure out what it should be. What is this line?
result_df = result_df[result_df['distance'] <= 10]
Would it be possible to pull the faces from the database to run age, gender, etc. detection on? It seemed like you can't use the output of FaceNet to run these detections, but I wanted to make sure, since it would be very convenient to store faces in a database and later get statistics on them.
I was trying to reproduce the code above using MySQL, but the search gives a null result even though I am using the images from the example. Could you give me a little help?
The following is the MySQL code snippet that returns None.
def pesquisar_emb(target_embedding):
    conn, cursor = conSvDb()

    target_statement = ''
    for i, value in enumerate(target_embedding):
        target_statement += 'SELECT %d AS dimensao, %s AS valor' % (i, str(value)) # MySQL does not support "FROM DUAL" like Oracle
        if i < len(target_embedding) - 1:
            target_statement += ' UNION ALL '

    select_statement = f'''
        SELECT nome_imagem, SUM(subtract_dims) AS distance_squared
        FROM (
            SELECT nome_imagem, (source - target) * (source - target) AS subtract_dims
            FROM (
                SELECT fm.nome_imagem, fe.valor AS source, target.valor AS target
                FROM rostos fm
                LEFT JOIN rosto_embeddings fe ON fm.id = fe.rosto_id
                LEFT JOIN (
                    {target_statement}
                ) AS target ON fe.dimensao = target.dimensao
            ) AS subquery1
        ) AS subquery2
        GROUP BY nome_imagem
        HAVING distance_squared < 100
        ORDER BY distance_squared ASC
    '''

    results = cursor.execute(select_statement)
    return results