Real Time Facial Expression Recognition on Streaming Data

Previously, we worked on facial expression recognition for a single custom image. We can also detect multiple faces in an image and then apply the same facial expression recognition procedure to each of them. In fact, we can do this continuously on streaming data, and these additions require little extra effort.

Face Detection

The easiest way to detect a face is to use a haar cascade classifier within OpenCV.


import cv2

#load OpenCV's pre-trained haar cascade for frontal faces (path depends on your installation)
face_cascade = cv2.CascadeClassifier('C:/ProgramData/Anaconda3/envs/tensorflow/Library/etc/haarcascades/haarcascade_frontalface_default.xml')

img = cv2.imread('/data/friends.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY) #haar cascades expect gray scale input

faces = face_cascade.detectMultiScale(gray, 1.3, 5) #returns (x, y, w, h) for each detected face

for (x, y, w, h) in faces:
    cv2.rectangle(img, (x, y), (x+w, y+h), (255, 0, 0), 2) #draw a rectangle around each face

cv2.imshow('img', img)
cv2.waitKey(0) #keep the window open until a key is pressed

There are several face detection solutions. OpenCV offers haar cascades and a Single Shot Multibox Detector (SSD). Dlib offers both a Histogram of Oriented Gradients (HOG) detector and a CNN-based Max-Margin Object Detection (MMOD) model. Finally, Multi-task Cascaded Convolutional Networks (MTCNN) is another common solution for face detection.

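For instance, here is a minimal sketch of using dlib's HOG detector as an alternative to the haar cascade above, assuming dlib is installed and reusing the same /data/friends.jpg image:

import cv2
import dlib

detector = dlib.get_frontal_face_detector() #HOG based frontal face detector

img = cv2.imread('/data/friends.jpg')
rects = detector(cv2.cvtColor(img, cv2.COLOR_BGR2GRAY), 1) #1 means upsample the image once

for rect in rects:
    #dlib returns rectangle objects instead of (x, y, w, h) tuples
    x, y, w, h = rect.left(), rect.top(), rect.width(), rect.height()
    cv2.rectangle(img, (x, y), (x+w, y+h), (255, 0, 0), 2)

cv2.imshow('img', img)
cv2.waitKey(0)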

Streaming Data

What if the source were a webcam instead of a still image? We can get help from OpenCV again.

cap = cv2.VideoCapture(0) #capture the default webcam

while True:
    ret, img = cap.read()

    #apply the same face detection procedure to each frame
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(gray, 1.3, 5)

    for (x, y, w, h) in faces:
        cv2.rectangle(img, (x, y), (x+w, y+h), (255, 0, 0), 2)

    cv2.imshow('img', img) #display the processed frame

    if cv2.waitKey(1) & 0xFF == ord('q'): #press q to quit
        break

cap.release()
cv2.destroyAllWindows()

No matter what the source is (still image or webcam), we can detect faces. Once the coordinates of the detected faces are calculated, we can extract them from the original image. The following code should be placed inside the for loop over faces. We also need to convert each face to gray scale and resize it to 48×48 pixels, as the facial expression recognition model requires.

detected_face = img[int(y):int(y+h), int(x):int(x+w)] #crop detected face
detected_face = cv2.cvtColor(detected_face, cv2.COLOR_BGR2GRAY) #transform to gray scale
detected_face = cv2.resize(detected_face, (48, 48)) #resize to 48x48

Expression Analysis

In the previous post, we constructed a model and trained it to recognize facial expressions. We will use the same pre-constructed model and its pre-trained weights.

from keras.models import model_from_json
model = model_from_json(open("facial_expression_model_structure.json", "r").read()) #load model structure
model.load_weights('facial_expression_model_weights.h5') #load pre-trained weights

Now, we can classify the facial expression of each detected face in an image.

import numpy as np
from keras.preprocessing import image

img_pixels = image.img_to_array(detected_face)
img_pixels = np.expand_dims(img_pixels, axis = 0)

img_pixels /= 255 #normalize pixel values to [0, 1], as in training

predictions = model.predict(img_pixels)

#the index of the highest probability gives the dominant emotion
max_index = np.argmax(predictions[0])

emotions = ('angry', 'disgust', 'fear', 'happy', 'sad', 'surprise', 'neutral')
emotion = emotions[max_index]

#write the emotion label above the detected face
cv2.putText(img, emotion, (int(x), int(y)), cv2.FONT_HERSHEY_SIMPLEX, 1, (255, 255, 255), 2)
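To see how the pieces fit together, here is a minimal end-to-end sketch that merges the webcam loop, face detection, preprocessing and classification steps above. It assumes the haar cascade file, the model structure and the weights are available in the working directory; adjust the paths for your setup:

import cv2
import numpy as np
from keras.models import model_from_json
from keras.preprocessing import image

face_cascade = cv2.CascadeClassifier('haarcascade_frontalface_default.xml') #adjust the path for your installation
model = model_from_json(open("facial_expression_model_structure.json", "r").read())
model.load_weights('facial_expression_model_weights.h5')
emotions = ('angry', 'disgust', 'fear', 'happy', 'sad', 'surprise', 'neutral')

cap = cv2.VideoCapture(0)

while True:
    ret, img = cap.read()
    if not ret:
        break

    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    for (x, y, w, h) in face_cascade.detectMultiScale(gray, 1.3, 5):
        cv2.rectangle(img, (x, y), (x+w, y+h), (255, 0, 0), 2)

        detected_face = cv2.resize(gray[y:y+h, x:x+w], (48, 48)) #crop and resize the face

        img_pixels = np.expand_dims(image.img_to_array(detected_face), axis = 0)
        img_pixels /= 255

        predictions = model.predict(img_pixels)
        emotion = emotions[np.argmax(predictions[0])]
        cv2.putText(img, emotion, (x, y), cv2.FONT_HERSHEY_SIMPLEX, 1, (255, 255, 255), 2)

    cv2.imshow('img', img)

    if cv2.waitKey(1) & 0xFF == ord('q'): #press q to quit
        break

cap.release()
cv2.destroyAllWindows()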

Evaluation

Applying both the face detection and facial expression recognition procedures to a still image seems very successful.

Applying the same procedure to video stream data also seems very satisfactory.

Besides, we can apply this to a webcam stream. In the demo, we try to act out all of the emotion candidates.

So, we have already recognized facial expressions of human beings. In this post, we used OpenCV to process stream data and detect faces in an image. Finally, we merged these steps to detect emotions on streaming data. The code of the project is pushed to GitHub, and you can find the pre-constructed model and pre-trained weights in the same repository.

Bonus

You can apply both face recognition and facial attribute analysis, including age, gender and emotion, in Python with a few lines of code. All of the pipeline steps, such as face detection, face alignment and analysis, are handled in the background.

Deepface is an open source framework for Python. It is available on PyPI as well.
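As a minimal sketch of the deepface interface, assuming the package is installed from PyPI (pip install deepface) and img.jpg is a hypothetical image of your own:

from deepface import DeepFace

#img.jpg is a hypothetical input; replace it with your own image
result = DeepFace.analyze(img_path = "img.jpg", actions = ['age', 'gender', 'emotion'])
print(result)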




22 Comments

  1. Hello Sefik 🙂
    I am using your model for emotion predictions, but it is not very accurate on my videos, and I wanted to do some fine tuning. Is there any advice or a bit of help you could give me as to where I should start? I know this is a very broad question but I've been lost for days.

    1. Hello,

      First of all, this approach is not the best but it is the fastest. You might prefer to adopt this approach in real time applications.

      On the other hand, you can improve the accuracy. For example, I applied Auto-ML to the same data set and accuracy improved from 57% to 66%. However, the structure of the network becomes very complex: there are 124 convolution layers in that model, whereas the model mentioned in this post has 5 convolution layers. In other words, the more accurate model is almost 25 times more complex than the regular one. You can find the Auto-ML post here: https://sefiks.com/2019/04/08/a-gentle-introduction-to-auto-keras/

      Alternatively, you can build your own model by applying transfer learning. Popular models such as VGG or Inception can be adapted here. I did a similar task for age and gender prediction; you could adapt that approach for the emotion analysis task. The transfer learning post can be found here: https://sefiks.com/2019/02/13/apparent-age-and-gender-prediction-in-keras/

      I hope these comments help you.

  2. Hi Sefik,
    I want to use your model for expression detection in my project at school, but I'm having an issue with it. I'm running this on a Mac. At line 2, just below importing cv2:
    face_cascade = cv2.CascadeClassifier('C:/ProgramData/Anaconda3/envs/tensorflow/Library/etc/haarcascades/haarcascade_frontalface_default.xml')

    What is this path that you are passing to CascadeClassifier? There is no such path on my machine, and I have looked everywhere I could think of.

    Your help here would be really appreciated.

    1. I will implement it but to be honest this autokeras model is too complex to run in real time.

  3. Hi Sefik!
    Thank you so much for this guide! I am trying to run real time prediction of age and gender on a Raspberry Pi 4 (4 GB). Although real time prediction of emotions works well, when I run real time prediction of age and gender I get only one frame per 5 seconds, which is extremely disappointing.
    Do you know any way to simplify the model? I think one frame per second would be enough. In this case, time is more important than accuracy.

    1. The age and gender prediction model uses VGG, and it is a really complex model for real time. In my own applications, I freeze the frame for 5 seconds when a face is detected.

      1. Is it possible to reduce the number of layers in this model, or to reduce the delay (possibly at the cost of accuracy)?

  4. Hi Sefik! Thanks again for your project!
    Do you know why the emotion recognition program detects people from more than 7 meters away from the camera, while the age-gender prediction program only detects people who are no more than 3 meters away?

    1. It might be related to the source data sets. People label the source images (e.g. happy or sad for emotion), and the ML algorithm then learns from these labeled images. The labeled faces may mostly appear at that distance.

  5. FileNotFoundError: [Errno 2] No such file or directory: 'facial_expression_model_structure.json'

    I am receiving this error

  6. Hey Sefik, I'm a beginner in computer vision and I want to know how I can use SSD with this method.

      1. I've already read it, and yet it doesn't work. Can you do a tutorial applying SSD to facial expressions?

  7. Hello, I would like to use this model in real time and would definitely need an increase in accuracy. Would using fewer emotions (happy, sad, neutral) increase accuracy? If so, how should I modify the files?
