Celebrity Look-Alike Face Recognition with Deep Learning in Keras

Finding your celebrity look-alike is a fun and popular topic. People wonder who they resemble, and several applications on the market serve exactly this purpose. However, they run as black boxes. So, have you ever wondered how these applications work? They are all based on the same principles as the face recognition task. In this post, we will build a celebrity look-alike face recognition application from scratch in Keras and TensorFlow.

Rami Malek as Freddie Mercury in Bohemian Rhapsody (2018)

Vlog

You can either continue to read this tutorial or watch the following video. They both cover celebrity look-alike prediction with the deepface facial recognition library.


🙋‍♂️ You may consider enrolling in my top-rated machine learning course on Udemy

Decision Trees for Machine Learning

Also, we will be able to build a real-time celebrity look-alike cam at the end of this tutorial.

Face recognition

A convolutional neural network expresses faces as vector representations. We expect the distance between representations of different photos of the same person to be small (0 is the best). Herein, we mostly check whether that distance is smaller than a threshold (e.g. 0.20) to recognize a face.

We just skip the threshold checking step in the celebrity look-alike task. The person with the minimum distance score is your celebrity look-alike. This finds the most similar celebrity in the target data set; the one found does not have to be a doppelganger.
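
The contrast between the two tasks can be shown in a few lines. The sketch below uses DeepFace's verify function for the threshold-based check; the file names img1.jpg and img2.jpg and the toy distances dictionary are placeholders, not part of the project.

from deepface import DeepFace

# face verification: decide same person / different person with a tuned threshold
result = DeepFace.verify(img1_path = "img1.jpg", img2_path = "img2.jpg", model_name = "VGG-Face")
print(result["distance"], result["verified"])  # verified is True if the distance is below the threshold

# look-alike search: no threshold, just keep the identity with the minimum distance
distances = {"celeb_a": 0.41, "celeb_b": 0.27}  # toy values for illustration
look_alike = min(distances, key = distances.get)  # -> "celeb_b"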

Data set

I will use the data set shared by ETH Zurich. The researchers shared this data set for the age and gender prediction task, but we can use the same data set as is. I downloaded the IMDB faces only version (7 GB).

# Ref https://data.vision.ee.ethz.ch/cvl/rrothe/imdb-wiki/
import scipy.io
import numpy as np
import pandas as pd

mat = scipy.io.loadmat('imdb_crop/imdb.mat')
columns = ["dob", "photo_taken", "full_path", "gender", "name", "face_location", "face_score", "second_face_score", "celeb_names", "celeb_id"]
instances = mat['imdb'][0][0][0].shape[1]
df = pd.DataFrame(index = range(0, instances), columns = columns)

# flatten the matlab struct into a pandas data frame
for i in mat:
   if i == "imdb":
      current_array = mat[i][0][0]
      for j in range(len(current_array)):
         df[columns[j]] = pd.DataFrame(current_array[j][0])

The data source has some useless pictures. Some pictures do not include any faces, some include multiple faces and some show ambiguous faces. We can discard them.

# remove pictures that do not include a face
df = df[df['face_score'] != -np.inf]

# some pictures include more than one face, remove them
df = df[df['second_face_score'].isna()]

# discard low-quality, ambiguous faces with a threshold
df = df[df['face_score'] >= 3]

The data set stores the physical location of each image. We need to load them as pixel values. My choice is to read images with OpenCV because we can additionally crop faces with this library. This step lasts 150 seconds in my tests.

import cv2

def getImagePixels(image_path):
   # full_path is stored as a single-element array, hence image_path[0]
   return cv2.imread("imdb_crop/" + image_path[0])

df['pixels'] = df['full_path'].apply(getImagePixels)

Representing images as vectors

We need to represent the items in the imdb data set as vectors. We will use DeepFace's represent function for this task. Herein, Oxford's VGG-Face and Google's Facenet are the model candidates to find vector representations of faces. They are both already covered in my previous posts. VGG-Face is my favorite, but it is all up to you. The imdb data set already includes detected faces, so I set the detector backend argument to skip because I do not want to apply any detection or alignment.

This block lasts 669 seconds (>11 minutes) in my tests. Notice that we process the images with Pandas, and it runs on a single CPU core.

from deepface import DeepFace

def findFaceRepresentation(img):
   try:
      # faces in imdb_crop are already cropped, so skip detection and alignment
      representation = DeepFace.represent(img_path = img, model_name = "VGG-Face", detector_backend = "skip")
   except:
      # e.g. unreadable image; mark it as None so it can be skipped later
      representation = None

   return representation

df['face_vector_raw'] = df['pixels'].apply(findFaceRepresentation)

Now, you need to represent your own image as a vector. The faces in the imdb data set are cropped with a 40% margin, so you should transform your picture to match this condition: detect the face in your image and add a margin. If your image already satisfies this condition, you can skip the face detection and margin steps. I used OpenCV's haar cascade classifier to detect faces.

yourself_representation = DeepFace.represent(img_path = "sefik.jpg", model_name = "VGG-Face", detector_backend = "opencv")
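
If you want to do the detection and margin step manually, the following is a rough sketch with OpenCV's haar cascade. The exact margin handling is my interpretation of the 40% rule, so adjust it to match imdb_crop if needed; haarcascade_frontalface_default.xml ships with the opencv-python package.

import cv2

face_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

img = cv2.imread("sefik.jpg")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
faces = face_cascade.detectMultiScale(gray, scaleFactor = 1.3, minNeighbors = 5)

if len(faces) > 0:
   x, y, w, h = faces[0]  # take the first detected face
   margin_w, margin_h = int(0.4 * w), int(0.4 * h)  # 40% margin, as in imdb_crop
   x0, y0 = max(0, x - margin_w // 2), max(0, y - margin_h // 2)
   x1, y1 = min(img.shape[1], x + w + margin_w // 2), min(img.shape[0], y + h + margin_h // 2)
   detected_face = img[y0:y1, x0:x1]  # pass this array to DeepFace.represent with detector_backend = "skip"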

Finding similarity / distance

Both your image and the images in the imdb data set are now represented as 2622-dimensional vectors. Now, we can find the similarity of each image vector in the imdb data set to your image vector. Cosine similarity or Euclidean distance can be used for this task. The larger the similarity (or the smaller the distance), the better.

from deepface.commons import distance

def findCosineSimilarity(source_representation, test_representation = yourself_representation):
   try:
      return distance.findCosineDistance(source_representation, test_representation)
   except:
      return 10  # assign a large value on exception; similar faces will have small values

df['distance'] = df['face_vector_raw'].apply(findCosineSimilarity)

Now you know how similar each image in the imdb data set is to your image. The smaller the distance score (0 is the best), the more similar the faces are. Let's sort the data frame by distance value and focus on the top 3.

import matplotlib.pyplot as plt

df = df.sort_values(by = ['distance'], ascending = True)

for i in range(0, 3):
   instance = df.iloc[i]
   name = instance['name']
   distance_score = instance['distance']
   full_path = instance['full_path'][0]
   img = cv2.imread("imdb_crop/" + full_path)
   print(i, ".", name, " (", distance_score, ")")
   plt.axis('off')
   plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
   plt.show()

Face detection

The train set images are already cropped so that only the facial area remains. You should crop your testing images as well.

There are several face detection solutions. OpenCV offers haar cascade and single shot multibox detector (SSD). Dlib offers Histogram of Oriented Gradients (HOG) and Max-Margin Object Detection (MMOD). Finally, Multi-task Cascaded Convolutional Networks (MTCNN) is a common solution for face detection.

Herein, haar cascade and HoG are legacy methods whereas SSD, MMOD and MTCNN are deep learning based modern solutions.

You can see the face detection performance of those models in the following video.

Here, you can watch how to use different face detectors in Python.
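
As a quick sketch, you can also try the different detectors directly from deepface. In recent releases the function is called extract_faces; older versions expose a similar detectFace function instead, so the exact call depends on your installed version.

from deepface import DeepFace

for backend in ["opencv", "ssd", "dlib", "mtcnn"]:
   try:
      faces = DeepFace.extract_faces(img_path = "sefik.jpg", detector_backend = backend)
      print(backend, "found", len(faces), "face(s)")
   except Exception as err:
      print(backend, "failed:", err)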

Testing

I tested this implementation with my own profile photo from the about me page. Here are the top 3 results. It seems that the deep learning system thinks Colin Hanks is the most similar one to me in the imdb data set. The following picture has a 73.10% similarity score.

Colin Hanks

Surprisingly, another photo of Colin Hanks appears on the podium. It has almost the same similarity score, 72.90%, and his pose is almost the same as in the base image. Rounding the similarity scores to integers would put this photo at the same rank.

Colin Hanks

Finally, the model says that the base image is similar to Jim Parsons. You know him as Sheldon Cooper in The Big Bang Theory. However, the similarity score is 67.05%, which is less than 70%. Maybe we should discard matches with a similarity below this value.

Jim Parsons

None of them has a similarity score greater than the 80% threshold. So, the model just says that these are the most similar faces in the imdb data set; it makes no claim that doppelgangers were found.
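
The scores here are quoted as similarity percentages rather than raw cosine distances. Assuming the percentage is simply derived from the cosine distance found above, a plausible conversion is the one-liner below; this is my assumption about how the numbers map, not something the code earlier in the post computes explicitly.

def to_similarity_percent(cosine_distance):
   # assumption: similarity% = (1 - cosine distance) * 100
   return round((1 - cosine_distance) * 100, 2)

print(to_similarity_percent(0.269))  # -> 73.1, the scale of the Colin Hanks score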

Let's change the base image. The following picture has 80.60% similarity to Kevin Spacey. I wish I were a House of Cards character like Frank Underwood.

Kevin Spacey

The model says that this base image is similar to Jim Parsons again. Now, it has a 78.61% similarity. Both celebrities have high similarity scores this time.

Jim Parsons

What do you think? Do these celebrities look like me or not? Please share your comments.

Large scale similarity search

Notice that we have a source image and find the distance between the source and each item in our database. In other words, we applied face verification several times. This has O(n) time complexity, which is problematic for large data sets. Imagine finding a celebrity look-alike in real time.

Herein, approximate nearest neighbor (a-nn) algorithms reduce the time complexity dramatically. Spotify Annoy, Facebook Faiss and NMSLIB are amazing a-nn libraries. Besides, Elasticsearch wraps NMSLIB and also comes with high scalability.
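
The following is a minimal sketch with Spotify Annoy, assuming the VGG-Face embeddings (2622 dimensions) stored in df['face_vector_raw'] are plain vectors and that yourself_representation is the vector computed earlier.

from annoy import AnnoyIndex

dims = 2622
index = AnnoyIndex(dims, 'angular')  # angular metric is closely related to cosine distance

# i is the positional index into df, so results can be looked up with df.iloc
for i, vector in enumerate(df['face_vector_raw'].values):
   if vector is not None:
      index.add_item(i, vector)

index.build(10)  # 10 trees; more trees -> better accuracy, slower build

# approximate top 3 neighbors of your own representation
neighbor_ids = index.get_nns_by_vector(yourself_representation, 3)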

Real time implementation

You can apply celebrity look-alike face recognition in real time as well. Here you can find the code of the real-time implementation.
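
As a rough idea of how such a cam works (this is a sketch, not the exact code linked above), you can capture frames from the webcam, represent each frame with the same VGG-Face configuration, and show the name of the minimum-distance celebrity. It assumes represent returns a plain embedding vector, as used earlier in this post.

import cv2
from deepface import DeepFace
from deepface.commons import distance

cap = cv2.VideoCapture(0)

while True:
   ret, frame = cap.read()
   if not ret:
      break
   try:
      target = DeepFace.represent(img_path = frame, model_name = "VGG-Face", detector_backend = "opencv")
      distances = df['face_vector_raw'].apply(
         lambda source: distance.findCosineDistance(source, target) if source is not None else 10
      )
      match = df.loc[distances.idxmin()]
      cv2.putText(frame, str(match['name']), (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
   except Exception:
      pass  # no face found in this frame

   cv2.imshow("celebrity look-alike", frame)
   if cv2.waitKey(1) & 0xFF == ord('q'):
      break

cap.release()
cv2.destroyAllWindows()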

Future work

Reading the imdb data set lasts 150 seconds and finding vector representations of the imdb data set lasts 11 minutes. These costly tasks run once at initialization. They are handled in Pandas, and Pandas runs on a single CPU core; no matter how strong your machine is, these blocks will always take long. Replacing Pandas with Modin or Dask might decrease these initialization times radically because of parallelism. On the other hand, the similarity search lasts seconds, so you can test different images rapidly as is.
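
For example, a minimal sketch with Modin would only change the import, since Modin mirrors the pandas API; this assumes Modin is installed (e.g. pip install "modin[ray]") and the actual speed-up depends on your machine and workload.

import modin.pandas as pd  # drop-in replacement for the pandas import above

df = pd.DataFrame(index = range(0, instances), columns = columns)
df['pixels'] = df['full_path'].apply(getImagePixels)  # apply is now distributed across CPU cores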

Besides, the VGG-Face model is used in this post to find vector representations of faces. Facenet might be added as a configurable alternative.

Furthermore, we currently compute a distance for every image in the imdb data set. We know that gender has a roughly uniform distribution in this data source. So, we can find the gender of one's picture first and filter the imdb data set by the found gender second, as sketched below. This would make the celebrity look-alike search roughly twice as fast.
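
Here is a sketch of that gender pre-filter. It assumes a recent deepface release where analyze returns a list of results with a "dominant_gender" key ("Man" / "Woman"); older versions return the result in a slightly different structure. It also assumes the imdb metadata encodes gender as 1 for male and 0 for female, which you should verify against the data source documentation.

from deepface import DeepFace

analysis = DeepFace.analyze(img_path = "sefik.jpg", actions = ["gender"])
target_gender = 1 if analysis[0]["dominant_gender"] == "Man" else 0

# roughly halves the number of distance calculations
df_filtered = df[df['gender'] == target_gender].copy()
df_filtered['distance'] = df_filtered['face_vector_raw'].apply(findCosineSimilarity)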

The data source also offers a wiki data set. We might blend both the imdb and wiki data sets to have many more samples.

Contributions and pull requests are welcome!

To Sum Up

I pushed the notebook of the celebrity look-alike face recognition project to GitHub. Besides, I plan to adapt it to find a painting look-alike, just like Google Arts, soon.

It is said that everybody has a doppelganger. Is this fact or myth? I would appreciate it if you share your own tests. Do not forget to mention me on Twitter as @serengil.




6 Comments

  1. Hello Sefik, first of all thank you very much for sharing such useful information with us. Even though I am using the wiki data set, reading the images has been running for an hour. Is it normal for it to take this long? You mentioned it took 150 seconds in your work, so I wanted to ask you. I hope you reply, happy holidays...

    1. Hello, I ran my experiments on a machine with a very powerful CPU. Since you are trying it on a local laptop, it may take this long.

      1. Thank you for your answer. I am working on Colab with GPU support, and I had to stop it because it took too long. I have a school project on the same topic as your work, and I cannot make progress because I am stuck at this step. I would really appreciate any suggestion or guidance. Have a nice day...

        1. Colab is not a platform I use much. Even though it provides GPU support, it offers a limited number of CPU cores. I recommend working on your local computer.

  2. Hello, is there a specific reason why we did not do a train/test split after the deep learning model?

Comments are closed.