We’ve covered how to predict identity, emotion, age and gender with deep learning in previous posts. Ethnicity and race are facial attributes as well, and we can predict them in a similar way. Recognizing ethnicity from face photos could contribute a lot to missing children cases, search investigations, the refugee crisis and genealogy research. We’ve also discussed ethnicity prediction from the perspective of AI ethics in an earlier post.

Data set
I’ve found two public data sets of face pictures labeled with ethnicity.

The first one is FairFace. It is a large-scale data set consisting of 86K train and 11K test instances. Its labels are East Asian, Southeast Asian, Indian, Black, White, Middle-Eastern and Latino-Hispanic. Merging the East Asian and Southeast Asian races into a single Asian race would be better.
import pandas as pd

train_df = pd.read_csv("fairface_label_train.csv")
test_df = pd.read_csv("fairface_label_val.csv")
The second one is UTKFace. It is a small-scale data set with 10K instances. Its labels are Asian, Indian, Black, White and Others (Latino and Middle Eastern).
Merging the two data sets increased the accuracy in my experiments from 68% to 72%, but I had to map the Latino and Middle Eastern races to Others to do so. The two results are therefore not directly comparable, and UTKFace did not improve the accuracy as much as I expected. That’s why I prefer to train my model with just the FairFace data set.
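To merge the two data sets, the FairFace labels have to be mapped onto the coarser UTKFace label set. A minimal sketch of such a mapping is shown below. The dictionary and the function name are hypothetical, and the label spellings follow this post, so they may differ from the strings in the actual CSV files.

```python
# Hypothetical mapping from FairFace labels (as spelled in this post)
# to the UTKFace label set. Check the spellings against the CSV files.
FAIRFACE_TO_UTKFACE = {
    "East Asian": "Asian",
    "Southeast Asian": "Asian",
    "Indian": "Indian",
    "Black": "Black",
    "White": "White",
    "Middle-Eastern": "Others",
    "Latino-Hispanic": "Others",
}

def to_common_label(fairface_label):
    """Map a FairFace race label onto the UTKFace label set."""
    return FAIRFACE_TO_UTKFACE[fairface_label]

# The merged problem has 5 classes instead of 6.
common_labels = sorted(set(FAIRFACE_TO_UTKFACE.values()))
print(common_labels)
```

Two FairFace classes disappear into Others here, which is the cost of the merge mentioned above.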
Ethnicity distribution
The number of instances for each race is roughly balanced in the FairFace data set.
100*train_df.groupby(['race']).count()[['file']]/train_df.groupby(['race']).count()[['file']].sum()
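The same percentages can also be computed with value_counts. The sketch below runs on a small made-up data frame instead of the real label file, so the numbers are illustrative only.

```python
import pandas as pd

# Toy stand-in for the FairFace label file (the real one has ~86K rows).
toy_df = pd.DataFrame({
    "file": ["1.jpg", "2.jpg", "3.jpg", "4.jpg"],
    "race": ["White", "Black", "White", "Indian"],
})

# Share of each race in percent.
distribution = toy_df["race"].value_counts(normalize=True) * 100
print(distribution)
```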

I’ve merged two Asian races into a single Asian race.
idx = train_df[(train_df['race'] == 'East Asian') | (train_df['race'] == 'Southeast Asian')].index
train_df.loc[idx, 'race'] = 'Asian'

idx = test_df[(test_df['race'] == 'East Asian') | (test_df['race'] == 'Southeast Asian')].index
test_df.loc[idx, 'race'] = 'Asian'
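An alternative way to express the same merge is a dictionary passed to replace. This is a sketch on a toy data frame, and the name merge_map is made up for illustration.

```python
import pandas as pd

toy_df = pd.DataFrame({"race": ["East Asian", "Southeast Asian", "White", "Indian"]})

# Labels that are not keys of the mapping are left unchanged by replace().
merge_map = {"East Asian": "Asian", "Southeast Asian": "Asian"}
toy_df["race"] = toy_df["race"].replace(merge_map)

print(toy_df["race"].tolist())
```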
Thus, the distribution becomes as illustrated below after the data manipulation.

Reading image pixels
The original data set includes just the image file names and their race labels.

We will read image pixels based on the file names.
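from keras.preprocessing import image
from tqdm import tqdm

tqdm.pandas()  # registers progress_apply on pandas objects

target_size = (224, 224)

def getImagePixels(file):
    # grayscale=False matches the older Keras API used in this post;
    # newer releases may expect color_mode='rgb' instead
    img = image.load_img(file, grayscale=False, target_size=target_size)
    x = image.img_to_array(img).reshape(1, -1)[0]  # flatten to a 1D vector
    return x

train_df['pixels'] = train_df['file'].progress_apply(getImagePixels)
test_df['pixels'] = test_df['file'].progress_apply(getImagePixels)
Now, image pixels are stored as a column.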

Input features
Pixels are stored as a flat array for each image. We need to reshape each row to (224, 224, 3). Besides, inputs to neural networks should be normalized because of the activation functions, so we scale pixel values to [0, 1]. These are the input features we will pass to the network.
import numpy as np

train_features = []; test_features = []

for i in range(0, train_df.shape[0]):
    train_features.append(train_df['pixels'].values[i])

for i in range(0, test_df.shape[0]):
    test_features.append(test_df['pixels'].values[i])

# restore the image shape: (instances, 224, 224, 3)
train_features = np.array(train_features)
train_features = train_features.reshape(train_features.shape[0], 224, 224, 3)

test_features = np.array(test_features)
test_features = test_features.reshape(test_features.shape[0], 224, 224, 3)

# normalize pixel values to [0, 1]
train_features = train_features / 255
test_features = test_features / 255
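The reshape and normalization steps can be checked on tiny fake images. The sketch below uses 4 x 4 x 3 images instead of 224 x 224 x 3 to keep it small, and the variable names are illustrative.

```python
import numpy as np

# Two fake "images" stored as flat pixel vectors, like the 'pixels' column.
height, width, channels = 4, 4, 3  # the post uses 224 x 224 x 3
flat_pixels = [
    np.full(height * width * channels, 255.0),  # all-white image
    np.zeros(height * width * channels),        # all-black image
]

# Stack into one array, restore the image shape and scale to [0, 1].
features = np.stack(flat_pixels).reshape(-1, height, width, channels) / 255
print(features.shape, features.min(), features.max())
```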
Target
The race column is the target value we will predict. However, we need to one-hot encode it first. The network will have 6 outputs, which is the number of races in the data set.
# copy to avoid modifying a slice of the original data frames
train_label = train_df[['race']].copy()
test_label = test_df[['race']].copy()

races = train_df['race'].unique()

for j in range(len(races)):  # label encoding
    current_race = races[j]
    print("replacing ", current_race, " to ", j+1)
    train_label['race'] = train_label['race'].replace(current_race, str(j+1))
    test_label['race'] = test_label['race'].replace(current_race, str(j+1))

train_label = train_label.astype({'race': 'int32'})
test_label = test_label.astype({'race': 'int32'})

# one hot encoding
train_target = pd.get_dummies(train_label['race'], prefix='race')
test_target = pd.get_dummies(test_label['race'], prefix='race')
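To make the encoding concrete, here is a small sketch of label encoding, one-hot encoding and decoding with argmax. It uses NumPy on a toy label set with three classes, which is an assumption for brevity and not the 6 races of the data set.

```python
import numpy as np

races = np.array(["Asian", "Black", "White"])  # toy label set
labels = np.array(["White", "Asian", "Black", "White"])

# Label encoding: position of each label in `races`.
encoded = np.array([np.where(races == label)[0][0] for label in labels])

# One-hot encoding: pick the matching rows of the identity matrix.
one_hot = np.eye(len(races), dtype=int)[encoded]

# Decoding with argmax, the same idea used later on the network outputs.
decoded = races[one_hot.argmax(axis=1)]
print(one_hot)
print(decoded)
```

The index of the 1 in each row is the index of the class in races, which is why argmax recovers the label.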
Train and validation split
Train and test sets are already separate. We will predict the test set at the end of this study. We should split the train set into train and validation sets to avoid overfitting. In this way, we can apply early stopping.
from sklearn.model_selection import train_test_split

train_x, val_x, train_y, val_y = train_test_split(
    train_features, train_target.values,
    test_size=0.12, random_state=17
)
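What the split does can be sketched with plain NumPy: shuffle the row indices once and hold out 12% of them. The function below is a hypothetical illustration and will not reproduce the exact rows that train_test_split selects.

```python
import numpy as np

def split_train_validation(features, targets, val_ratio=0.12, seed=17):
    """Shuffle indices once and hold out `val_ratio` of the rows."""
    rng = np.random.default_rng(seed)
    indices = rng.permutation(len(features))
    n_val = int(round(len(features) * val_ratio))
    val_idx, train_idx = indices[:n_val], indices[n_val:]
    return features[train_idx], features[val_idx], targets[train_idx], targets[val_idx]

# Toy data: 100 rows where feature and target share the same value.
x = np.arange(100).reshape(100, 1)
y = np.arange(100)
toy_train_x, toy_val_x, toy_train_y, toy_val_y = split_train_validation(x, y)
print(toy_train_x.shape, toy_val_x.shape)
```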
Base Model
We will use VGG-Face for transfer learning. Let’s construct it first.
from keras.models import Sequential, Model
from keras.layers import ZeroPadding2D, Convolution2D, MaxPooling2D, Dropout, Flatten, Activation

model = Sequential()
model.add(ZeroPadding2D((1,1),input_shape=(224,224, 3)))
model.add(Convolution2D(64, (3, 3), activation='relu'))
model.add(ZeroPadding2D((1,1)))
model.add(Convolution2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D((2,2), strides=(2,2)))

model.add(ZeroPadding2D((1,1)))
model.add(Convolution2D(128, (3, 3), activation='relu'))
model.add(ZeroPadding2D((1,1)))
model.add(Convolution2D(128, (3, 3), activation='relu'))
model.add(MaxPooling2D((2,2), strides=(2,2)))

model.add(ZeroPadding2D((1,1)))
model.add(Convolution2D(256, (3, 3), activation='relu'))
model.add(ZeroPadding2D((1,1)))
model.add(Convolution2D(256, (3, 3), activation='relu'))
model.add(ZeroPadding2D((1,1)))
model.add(Convolution2D(256, (3, 3), activation='relu'))
model.add(MaxPooling2D((2,2), strides=(2,2)))

model.add(ZeroPadding2D((1,1)))
model.add(Convolution2D(512, (3, 3), activation='relu'))
model.add(ZeroPadding2D((1,1)))
model.add(Convolution2D(512, (3, 3), activation='relu'))
model.add(ZeroPadding2D((1,1)))
model.add(Convolution2D(512, (3, 3), activation='relu'))
model.add(MaxPooling2D((2,2), strides=(2,2)))

model.add(ZeroPadding2D((1,1)))
model.add(Convolution2D(512, (3, 3), activation='relu'))
model.add(ZeroPadding2D((1,1)))
model.add(Convolution2D(512, (3, 3), activation='relu'))
model.add(ZeroPadding2D((1,1)))
model.add(Convolution2D(512, (3, 3), activation='relu'))
model.add(MaxPooling2D((2,2), strides=(2,2)))

# fully connected layers expressed as convolutions
model.add(Convolution2D(4096, (7, 7), activation='relu'))
model.add(Dropout(0.5))
model.add(Convolution2D(4096, (1, 1), activation='relu'))
model.add(Dropout(0.5))
model.add(Convolution2D(2622, (1, 1)))
model.add(Flatten())
model.add(Activation('softmax'))

#related blog post: https://sefiks.com/2018/08/06/deep-face-recognition-with-keras/
model.load_weights('vgg_face_weights.h5')
Transfer Learning
Its early layers can already detect some facial patterns, so we do not have to train it from scratch, especially since we do not have millions of training instances. We can lock its early layers and expect the late layers to learn.
for layer in model.layers[:-7]:
    layer.trainable = False
In this way, all layers except the last 7 are locked and their weights will not be updated. We expect the last 7 layers to learn something.
The original VGG-Face network has 2622 outputs but here we need just 6 outputs, one per race. We will customize VGG-Face here, and it becomes VGG-Race.
num_of_classes = 6  # number of races after merging the two Asian labels

# replace the 2622-output head with a 6-output one on top of the second dropout layer
base_model_output = Convolution2D(num_of_classes, (1, 1), name='predictions')(model.layers[-4].output)
base_model_output = Flatten()(base_model_output)
base_model_output = Activation('softmax')(base_model_output)

race_model = Model(inputs=model.input, outputs=base_model_output)
Training
Instead of feeding all the train data at once, I prefer to feed it in batches. I got the best result with a batch size of 16,384 (2^14). I feed 16K randomly selected instances in every epoch. If the validation loss does not decrease for 50 rounds, training is terminated to avoid overfitting.
import keras
from keras.callbacks import ModelCheckpoint

race_model.compile(loss='categorical_crossentropy'
    , optimizer=keras.optimizers.Adam(), metrics=['accuracy'])

# keeps the weights of the epoch with the lowest validation loss
checkpointer = ModelCheckpoint(filepath='race_model_single_batch.hdf5'
    , monitor="val_loss", verbose=1, save_best_only=True, mode='auto')

epochs = 80  # assumption: the post reports 80 training rounds; adjust as needed
batch_size = pow(2, 14); patience = 50
last_improvement = 0; best_iteration = 0
val_scores = []; train_scores = []

loss = 1000000  # initialize as a large value

for i in range(0, epochs):
    print("Epoch ", i, ". ", end='')

    # randomly selected instances for this round
    ix_train = np.random.choice(train_x.shape[0], size=batch_size)

    score = race_model.fit(
        train_x[ix_train], train_y[ix_train]
        , epochs=1
        , validation_data=(val_x, val_y)
        , callbacks=[checkpointer]
    )

    val_loss = score.history['val_loss'][0]; train_loss = score.history['loss'][0]
    val_scores.append(val_loss); train_scores.append(train_loss)

    if val_loss < loss:
        loss = val_loss * 1
        last_improvement = 0
        best_iteration = i * 1
    else:
        last_improvement = last_improvement + 1
        print("try to decrease val loss for ", patience - last_improvement, " epochs more")

    if last_improvement == patience:
        print("there is no loss decrease in validation for ", patience, " epochs. early stopped")
        break
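The early stopping rule in the loop above can be isolated and tried on a plain list of validation losses. The function name and the loss values below are made up for illustration.

```python
def best_epoch_with_patience(val_losses, patience):
    """Return (best_epoch, stopped_epoch) for a sequence of validation losses."""
    best_loss = float("inf")
    best_epoch = -1
    rounds_without_improvement = 0
    stopped_epoch = len(val_losses) - 1
    for epoch, val_loss in enumerate(val_losses):
        if val_loss < best_loss:
            best_loss = val_loss
            best_epoch = epoch
            rounds_without_improvement = 0
        else:
            rounds_without_improvement += 1
            if rounds_without_improvement == patience:
                stopped_epoch = epoch  # give up: no improvement for `patience` rounds
                break
    return best_epoch, stopped_epoch

losses = [1.0, 0.9, 0.8, 0.85, 0.9, 0.95, 0.7]
print(best_epoch_with_patience(losses, patience=3))  # (2, 5)
```

With a patience of 3 the run stops at epoch 5 and never sees the better loss at epoch 6, which shows why a generous patience value such as 50 can be worth the extra training time.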
Loss
The best epoch was 29. I trained the network for 80 rounds, but after epoch 30 the train loss kept decreasing while the validation loss increased. That’s exactly overfitting.
import matplotlib.pyplot as plt

plt.plot(val_scores[0:best_iteration+1], label='val_loss')
plt.plot(train_scores[0:best_iteration+1], label='train_loss')
plt.legend(loc='upper right')
plt.show()

That’s why I loaded the weights from the best iteration.
from keras.models import load_model

race_model = load_model("race_model_single_batch.hdf5")
race_model.save_weights('race_model_single_batch.h5')
Evaluation
We trained the network with the train set and used the validation set for early stopping. The selected epoch is the best iteration for the validation set. However, the network could have memorized the validation set, so it could still be overfitted. That’s why we haven’t fed the test set to the network yet. If the model is robust, test and validation loss should be close.
test_perf = race_model.evaluate(test_features, test_target.values, verbose=1)
print(test_perf)

validation_perf = race_model.evaluate(val_x, val_y, verbose=1)
print(validation_perf)

# gap between validation and test loss
abs(validation_perf[0] - test_perf[0])
Both test and validation loss are 0.88, and both accuracies are 68%. We can say that the model is robust.
Prediction
We can make predictions for the test set.
predictions = race_model.predict(test_features)
Also, we can print prediction and actual values and plot the original image as well.
predictions = race_model.predict(test_features)

prediction_classes = []; actual_classes = []

for i in range(0, predictions.shape[0]):
    prediction = np.argmax(predictions[i])
    prediction_classes.append(races[prediction])
    actual = np.argmax(test_target.values[i])
    actual_classes.append(races[actual])

    if i == 10:  # show a single example
        print("Actual: ", races[actual])
        print("Predicted: ", races[prediction])

        img = (test_df.iloc[i]['pixels'].reshape([224, 224, 3])) / 255
        plt.imshow(img); plt.show()

Confusion matrix
Accuracy alone does not tell the whole story for classification problems. We need precision and recall values as well. A confusion matrix is a good way to monitor the success of your model.
from sklearn.metrics import classification_report, confusion_matrix
import seaborn as sn

# pass labels explicitly so that the matrix rows and columns follow the order of `races`
cm = confusion_matrix(actual_classes, prediction_classes, labels=races)

df_cm = pd.DataFrame(cm, index=races, columns=races)
sn.heatmap(df_cm, annot=True, annot_kws={"size": 10})
The resulting heat map shows where the model succeeds and which races it confuses with each other.
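Precision and recall can be read directly from a confusion matrix. The sketch below uses a made-up 3-class matrix, with rows as actual classes and columns as predictions (the scikit-learn convention), and is not the matrix of the race model.

```python
import numpy as np

# Toy confusion matrix: rows are actual classes, columns are predicted classes.
toy_cm = np.array([
    [8, 1, 1],
    [2, 6, 2],
    [0, 1, 9],
])

true_positives = np.diag(toy_cm)
precision = true_positives / toy_cm.sum(axis=0)  # correct / all predicted as the class
recall = true_positives / toy_cm.sum(axis=1)     # correct / all that belong to the class
accuracy = true_positives.sum() / toy_cm.sum()

print("precision:", precision)
print("recall:", recall)
print("accuracy:", accuracy)
```

The second class has a recall of only 0.6 although the overall accuracy is about 0.77, which is the kind of detail a single accuracy value hides.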

Predicting custom images
We can predict the ethnicity for custom images as well.
demo_set = ['fei-fei-li.jpg', 'sundar-pichai.jpg', 'obama.jpg', 'katy.jpg']

for file in demo_set:
    path = 'demo/%s' % (file)
    img = image.load_img(path, grayscale=False, target_size=target_size)
    img = image.img_to_array(img) / 255  # shape (224, 224, 3), scaled to [0, 1]

    plt.imshow(img)
    plt.show()

    img = np.expand_dims(img, axis=0)  # add the batch dimension
    prediction_proba = race_model.predict(img)

    print("Prediction: ", races[np.argmax(prediction_proba)])
    print("---------------------------")
I’ve also run predictions for the characters of Silicon Valley. The results are really satisfactory.
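Besides the top class, it can be useful to look at the whole probability distribution the network returns. The helper below is a hypothetical sketch that ranks classes by probability, using a made-up softmax output and a toy label set.

```python
import numpy as np

def rank_predictions(probabilities, labels):
    """Return (label, probability) pairs sorted from most to least likely."""
    order = np.argsort(probabilities)[::-1]
    return [(labels[i], float(probabilities[i])) for i in order]

toy_labels = ["Asian", "Black", "White"]   # toy label set
toy_proba = np.array([0.2, 0.7, 0.1])      # made-up softmax output

for label, p in rank_predictions(toy_proba, toy_labels):
    print(f"{label}: {p:.0%}")
```

With the real model, the first row of prediction_proba and the races array could be passed in the same way.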

Loading pre-trained network
I shared the pre-trained network weights on Google Drive. You can skip the training step and load the weights once the race model is built.
race_model.load_weights('race_model_weights_full_v2.h5')
Real Time Ethnicity Prediction
We can apply race prediction in real time as well. Its source code is already pushed to GitHub. OpenCV’s haar cascade module detects the face, and we pass the detected face to the model.
Conclusion
So, we’ve covered how to build a race and ethnicity classifier from scratch in this post. I pushed the source code of this post as a notebook to GitHub. Besides, its real time implementation code is pushed to GitHub, too. Pre-trained network weights are shared on Google Drive because of their size. There are many ways to support a project, and starring the GitHub repo is just one.
Python library
deepface is a lightweight facial analysis framework covering both face recognition and demography such as age, gender, race and emotion. If you are not interested in building neural network models from scratch, you might adopt deepface. It is fully open-source and available on PyPI. You can make predictions with a few lines of code.
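A minimal sketch of such a call is shown below. It assumes deepface is installed and uses its DeepFace.analyze function restricted to the race action. The two helper names and the image path are hypothetical, and the return format depends on the installed version, so the helper accepts both a single dictionary and a list of dictionaries.

```python
def analyze_race(img_path):
    """Run deepface's race analysis on one image file."""
    # Imported lazily so the helper below also works without deepface installed.
    from deepface import DeepFace
    return DeepFace.analyze(img_path=img_path, actions=["race"])

def dominant_race(result):
    """Extract the top label from an analyze() result.

    Newer deepface versions return a list with one dict per detected face,
    older ones return a single dict.
    """
    first_face = result[0] if isinstance(result, list) else result
    return first_face["dominant_race"]

# Usage (requires `pip install deepface` and a real image path):
# print(dominant_race(analyze_race("img1.jpg")))
```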

Here, you can watch how to apply facial attribute analysis in Python with just a few lines of code.
Real time implementation
Real time facial attribute analysis is available in DeepFace.
Also, deepface offers a UI built with React for real time applications.
Anti-Spoofing and Liveness Detection
What if DeepFace is given fake or spoofed images? This becomes a serious issue if it is used in a security system. To address this, DeepFace includes an anti-spoofing feature for face verification or liveness detection.