How to Find False Positives in Facial Recognition with Neo4j

Facial recognition models have already reached and passed human-level accuracy. Still, these cutting-edge models make some clear mistakes when verifying face pairs. Graphs are amazing tools for analyzing some of these obvious patterns. In this post, we are going to use both deepface and neo4j to detect false positive cases in facial recognition tasks.

Pregnancy test (pexels)

Neo4j Nodes 2022 Talk

I gave a talk at the NODES 2022 event and explained at a high level how to detect false negatives and false positives in facial recognition. You might want to watch this video before reading this tutorial, but this tutorial covers the subject in much more depth with hands-on programming.


🙋‍♂️ You may consider enrolling in my top-rated machine learning course on Udemy

Decision Trees for Machine Learning

Vlog

You can either watch the following video or continue reading this tutorial. Both cover finding false positives in facial recognition with neo4j.

Dataset

We will use the deepface unit test items in this experiment. There are 61 images of 12 different people. For each image pair, we will decide whether they show the same person or different persons. This requires verifying 61×60 = 3660 pairs. Let's find the exact image paths and their file names first.

import os

# collect the paths of all jpg images under the deepface unit test folder
img_paths = []
for root, dirs, files in os.walk("../deepface/tests/dataset/"):
    for file in files:
        if file.endswith(".jpg"):
            img_path = os.path.join(root, file)
            img_paths.append(img_path)

print(f"there are {len(img_paths)} items.")

Embeddings

Facial recognition models are responsible for representing facial images as vectors. Face detection and alignment are early stages of the facial recognition pipeline; they aim to feed clean inputs to the facial recognition model. In this experiment, I prefer to use FaceNet, which represents faces as 128-dimensional vectors, as the facial recognition model and MTCNN as the face detector. Let's find the vector representations of the deepface unit test items.

model_name = "Facenet"
detector_backend = "mtcnn"

instances = {}
for i in tqdm.tqdm(range(0, len(img_paths))):
    img_path = img_paths[i]
    label = img_path.split("/")[-1].split(".")[0]
    embedding = DeepFace.represent(img_path=img_path, 
                               model_name=model_name, 
                               detector_backend=detector_backend)[0]["embedding"]    
    instances[label] = embedding
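
As a quick sanity check, we can confirm that each stored vector has the 128 dimensions we expect from FaceNet; this snippet only inspects the instances dictionary built above.

sample_label = list(instances.keys())[0]
print(f"{sample_label} is represented as a {len(instances[sample_label])}-dimensional vector")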

Connecting to Neo4j

We need to initialize the connection first.

from neo4j import GraphDatabase, basic_auth

driver = GraphDatabase.driver("bolt://localhost:7687", auth=basic_auth("neo4j", "*****"))
session = driver.session()

#flush nodes
result = session.run("MATCH (n:Face) DETACH DELETE n")

Storing facial images in graph database

We are going to create a node with the Face label for each facial image. Each node will have name and embedding properties.

with session.begin_transaction() as trx:
    for i, label in enumerate(instances.keys()):
        statement = f"""
            MERGE (f{i}:Face {{name: '{label}'}})
            SET f{i}.embedding = {instances[label]}
        """
        trx.run(statement)
    trx.commit()

This will create 61 individual nodes on the Neo4j side.
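
As a side note, building the statement with an f-string works for this dataset, but the same nodes could also be created with query parameters, which avoids quoting problems in names and lets the driver handle the embedding lists. A minimal sketch of that alternative:

with session.begin_transaction() as trx:
    for label, embedding in instances.items():
        # $name and $embedding are bound by the driver instead of being formatted into the string
        trx.run(
            "MERGE (f:Face {name: $name}) SET f.embedding = $embedding",
            name=label,
            embedding=embedding,
        )
    trx.commit()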

Nodes

Creating edges

We will create an edge between two nodes if deepface verifies that pair. A pair is verified if the distance between its embeddings is less than a threshold value. deepface uses the following threshold values for each facial recognition model and distance metric pair.

thresholds = {
    'VGG-Face': {'cosine': 0.40, 'euclidean': 0.60, 'euclidean_l2': 0.86},
    'Facenet': {'cosine': 0.40, 'euclidean': 10, 'euclidean_l2': 0.80},
    'Facenet512': {'cosine': 0.30, 'euclidean': 23.56, 'euclidean_l2': 1.04},
    'ArcFace': {'cosine': 0.68, 'euclidean': 4.15, 'euclidean_l2': 1.13},
    'Dlib': {'cosine': 0.07, 'euclidean': 0.6, 'euclidean_l2': 0.4},
    'SFace': {'cosine': 0.5932763306134152, 'euclidean': 10.734038121282206, 'euclidean_l2': 1.055836701022614},
    'OpenFace': {'cosine': 0.10, 'euclidean': 0.55, 'euclidean_l2': 0.55},
    'DeepFace': {'cosine': 0.23, 'euclidean': 64, 'euclidean_l2': 0.64},
    'DeepID': {'cosine': 0.015, 'euclidean': 45, 'euclidean_l2': 0.17}
}
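
To make this decision rule concrete, the following sketch shows how a single pair could be verified in plain Python with the thresholds above. The is_same_person helper is just an illustration and not part of deepface, which exposes this logic through its own verify function.

import numpy as np

# illustrative helper (not a deepface API): verify one pair with cosine distance
def is_same_person(alpha_embedding, beta_embedding, model_name="Facenet", metric="cosine"):
    alpha = np.array(alpha_embedding)
    beta = np.array(beta_embedding)
    # cosine distance = 1 - cosine similarity
    distance = 1 - np.dot(alpha, beta) / (np.linalg.norm(alpha) * np.linalg.norm(beta))
    return distance < thresholds[model_name][metric]

labels = list(instances.keys())
print(is_same_person(instances[labels[0]], instances[labels[1]]))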

In this experiment, we will use cosine as the distance metric. The threshold is 0.40 for the FaceNet and cosine pair. We will run the following cypher statement on the Neo4j side.

MATCH (p1:Face)
MATCH (p2:Face)
WHERE p1.name <> p2.name
WITH p1, p2, 1-gds.similarity.cosine(p1.embedding, p2.embedding) as distance
WHERE distance < 0.40
MERGE (p1)-[e:distance]-(p2)
SET e.distance=distance

This will create edges between nodes according to the verification decisions of deepface.
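
Instead of the Neo4j Browser, the same statement could also be executed from the Python session we initialized earlier; a minimal sketch, assuming the Graph Data Science plugin is installed so that gds.similarity.cosine is available.

edge_query = """
    MATCH (p1:Face)
    MATCH (p2:Face)
    WHERE p1.name <> p2.name
    WITH p1, p2, 1 - gds.similarity.cosine(p1.embedding, p2.embedding) AS distance
    WHERE distance < 0.40
    MERGE (p1)-[e:distance]-(p2)
    SET e.distance = distance
"""
session.run(edge_query)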

Nodes with edges

False positives

False positives tend to connect strongly connected clusters to each other weakly. Notice the cluster on the upper-right side of the graph: it consists of 3 strongly connected sub-clusters. If we overlay the real images on the nodes, we can see that this cluster has 6 false positive edges. Jack Dorsey, Matt Damon and Leonardo DiCaprio are weakly connected to each other, whereas the images of each identity are strongly connected among themselves.

False positive cases

This can be detected by the human eye, but we need to formulate this pattern so that it can be found programmatically.

Graph data science functions

We need to activate the gds functions in neo4j.conf because they are restricted in the default configuration. Add the following line and restart the Neo4j service afterwards.

dbms.security.procedures.unrestricted=algo.*,apoc.*,gds.* 

Then, we need to project the current graph into the in-memory graph catalog once.

CALL gds.graph.project('myGraph', 'Face', {distance: {properties: 'distance'}}) 

Centrality

Neo4j comes with some out-of-the-box centrality functions, and betweenness centrality is one of them. Betweenness centrality measures how often a node lies on the shortest paths between other pairs of nodes, so nodes that bridge otherwise separate clusters get high scores.

CALL gds.betweenness.stream('myGraph')
YIELD nodeId, score
WHERE score > 0
RETURN gds.util.asNode(nodeId).name AS name, score
ORDER BY score DESC

This query returns 8 records in total.

Betweenness Centrality Results

Interestingly, the edges between nodes img35, img30, img31, img32, img59 and img62 are actually false positive decisions, and they all appear in the betweenness centrality results. So, we can drop the edges between these nodes, as shown in the sketch below. In that way, we will have fewer false positive classifications.
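
To act on this finding, the relationships among the flagged nodes can be deleted through the same Python session. This sketch assumes, as stated above, that the edges among these particular nodes are the false positive ones.

false_positive_nodes = ['img35', 'img30', 'img31', 'img32', 'img59', 'img62']

# delete distance edges whose endpoints are both in the flagged node list
drop_query = """
    MATCH (p1:Face)-[e:distance]-(p2:Face)
    WHERE p1.name IN $names AND p2.name IN $names
    DELETE e
"""
session.run(drop_query, names=false_positive_nodes)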

Conclusion

So, we have combined deepface and neo4j in this post to detect false positive cases. As mentioned in the introduction, facial recognition models have already passed human-level accuracy. This approach lets us drop false positive decisions and pushes facial recognition pipelines even closer to perfect!






Like this blog? Support me on Patreon

Buy me a coffee