Apparent Age and Gender Prediction in Keras

Computer vision researchers of ETH Zurich University (Switzerland) announced a very successful apparent age and gender prediction models. They both shared how they designed the machine learning model and pre-trained weights for transfer learning. Their implementation was based on Caffe framework. Even though I tried to convert Caffe model and weights to Keras / TensorFlow, I couldn’t handle this. That’s why, I intend to adopt this research from scratch in Keras.

katy-perry-ages-v2
Katy Perry Transformation

What this post offers?

We can apply age and gender predictions in real time.


🙋‍♂️ You may consider to enroll my top-rated machine learning course on Udemy

Decision Trees for Machine Learning

DeepFace library for Python covers age prediction. You can run age estimation with a few lines of code.

Pre-trained model

In this post, we are going to re-train the age and gender prediction models from scratch. If you focus on just prediction stage, then the following video might attract your attention. This subject is covered in a dedicated blog post actually: Age and Gender Prediction with Deep Learning in OpenCV. We will use pre-trained models for Caffe within OpenCV in this case. Besides, you don’t have to have Caffe on your environment. OpenCV handles to build Caffe models with its dnn module.

On the other hand, if training stage attracts your attention, then you should continue to read this blog post.

Dataset

The original work consumed face pictures collected from IMDB (7 GB) and Wikipedia (1 GB). You can find these data sets here. In this post, I will just consume wiki data source to develop solution fast. You should download faces only files.

Extracting wiki_crop.tar creates 100 folders and an index file (wiki.mat). The index file is saved as Matlab format. We can read Matlab files in python with SciPy.

import scipy.io
mat = scipy.io.loadmat('wiki_crop/wiki.mat')

Converting pandas dataframe will make transformations easier.

instances = mat['wiki'][0][0][0].shape[1]

columns = ["dob", "photo_taken", "full_path", "gender", "name", "face_location", "face_score", "second_face_score"]

import pandas as pd
df = pd.DataFrame(index = range(0,instances), columns = columns)

for i in mat:
if i == "wiki":
current_array = mat[i][0][0]
for j in range(len(current_array)):
df[columns[j]] = pd.DataFrame(current_array[j][0])

wiki-crop-dataset
Initial data set

Data set contains date of birth (dob) in Matlab datenum format. We need to convert this to Python datatime format. We just need the birth year.

from datetime import datetime, timedelta
def datenum_to_datetime(datenum):
days = datenum % 1
hours = days % 1 * 24
minutes = hours % 1 * 60
seconds = minutes % 1 * 60
exact_date = datetime.fromordinal(int(datenum)) \
+ timedelta(days=int(days)) + timedelta(hours=int(hours)) \
+ timedelta(minutes=int(minutes)) + timedelta(seconds=round(seconds)) \
- timedelta(days=366)

return exact_date.year

df['date_of_birth'] = df['dob'].apply(datenum_to_datetime)
wiki-crop-dataset-dob
Adding exact birth date

Extracting date of birth from matlab datenum format





Now, we have both date of birth and photo taken time. Subtracting these values will give us the ages.

df['age'] = df['photo_taken'] - df['date_of_birth']

Data cleaning

Some pictures don’t include people in the wiki data set. For example, a vase picture exists in the data set. Moreover, some pictures might include two person. Furthermore, some are taken distant. Face score value can help us to understand the picture is clear or not. Also, age information is missing for some records. They all might confuse the model. We should ignore them. Finally, unnecessary columns should be dropped to occupy less memory.

#remove pictures does not include face
df = df[df['face_score'] != -np.inf]

#some pictures include more than one face, remove them
df = df[df['second_face_score'].isna()]

#check threshold
df = df[df['face_score'] >= 3]

#some records do not have a gender information
df = df[~df['gender'].isna()]

df = df.drop(columns = ['name','face_score','second_face_score','date_of_birth','face_location'])

Some pictures are taken for unborn people. Age value seems to be negative for some records. Dirty data might cause this. Moreover, some seems to be alive for more than 100. We should restrict the age prediction problem for 0 to 100 years.

#some guys seem to be greater than 100. some of these are paintings. remove these old guys
df = df[df['age'] <= 100]

#some guys seem to be unborn in the data set
df = df[df['age'] > 0]

The raw data set will be look like the following data frame.

wiki-crop-dataset-raw
Raw data set

We can visualize the target label distribution.

histogram_age = df['age'].hist(bins=df['age'].nunique())
histogram_gender = df['gender'].hist(bins=df['gender'].nunique())
age-gender-distribution
Age and gender distribution in the data set

Full path column states the exact location of the picture on the disk. We need its pixel values.

target_size = (224, 224)

def getImagePixels(image_path):
img = image.load_img("wiki_crop/%s" % image_path[0], grayscale=False, target_size=target_size)
x = image.img_to_array(img).reshape(1, -1)[0]
#x = preprocess_input(x)
return x

df['pixels'] = df['full_path'].apply(getImagePixels)

We can extract the real pixel values of pictures

wiki-crop-dataset-pixels

Adding pixels

Apparent age prediction model

Age prediction is a regression problem. But researchers define it as a classification problem. There are 101 classes in the output layer for ages 0 to 100. they applied transfer learning for this duty. Their choice was VGG for imagenet.

Preparing input output

Pandas data frame includes both input and output information for age and gender prediction tasks. Wee should just focus on the age task.

classes = 101 #0 to 100
target = df['age'].values
target_classes = keras.utils.to_categorical(target, classes)

features = []

for i in range(0, df.shape[0]):
features.append(df['pixels'].values[i])

features = np.array(features)
features = features.reshape(features.shape[0], 224, 224, 3)

Also, we need to split data set as training and testing set.





from sklearn.model_selection import train_test_split
train_x, test_x, train_y, test_y = train_test_split(features, target_classes, test_size=0.30)

The final data set consists of 22578 instances. It is splitted into 15905 train instances and 6673 test instances .

Transfer learning

As mentioned, researcher used VGG imagenet model. Still, they tuned weights for this data set. Herein, I prefer to use VGG-Face model. Because, this model is tuned for face recognition task. In this way, we might have outcomes for patterns in the human face.

#VGG-Face model
model = Sequential()
model.add(ZeroPadding2D((1,1),input_shape=(224,224, 3)))
model.add(Convolution2D(64, (3, 3), activation='relu'))
model.add(ZeroPadding2D((1,1)))
model.add(Convolution2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D((2,2), strides=(2,2)))

model.add(ZeroPadding2D((1,1)))
model.add(Convolution2D(128, (3, 3), activation='relu'))
model.add(ZeroPadding2D((1,1)))
model.add(Convolution2D(128, (3, 3), activation='relu'))
model.add(MaxPooling2D((2,2), strides=(2,2)))

model.add(ZeroPadding2D((1,1)))
model.add(Convolution2D(256, (3, 3), activation='relu'))
model.add(ZeroPadding2D((1,1)))
model.add(Convolution2D(256, (3, 3), activation='relu'))
model.add(ZeroPadding2D((1,1)))
model.add(Convolution2D(256, (3, 3), activation='relu'))
model.add(MaxPooling2D((2,2), strides=(2,2)))

model.add(ZeroPadding2D((1,1)))
model.add(Convolution2D(512, (3, 3), activation='relu'))
model.add(ZeroPadding2D((1,1)))
model.add(Convolution2D(512, (3, 3), activation='relu'))
model.add(ZeroPadding2D((1,1)))
model.add(Convolution2D(512, (3, 3), activation='relu'))
model.add(MaxPooling2D((2,2), strides=(2,2)))

model.add(ZeroPadding2D((1,1)))
model.add(Convolution2D(512, (3, 3), activation='relu'))
model.add(ZeroPadding2D((1,1)))
model.add(Convolution2D(512, (3, 3), activation='relu'))
model.add(ZeroPadding2D((1,1)))
model.add(Convolution2D(512, (3, 3), activation='relu'))
model.add(MaxPooling2D((2,2), strides=(2,2)))

model.add(Convolution2D(4096, (7, 7), activation='relu'))
model.add(Dropout(0.5))
model.add(Convolution2D(4096, (1, 1), activation='relu'))
model.add(Dropout(0.5))
model.add(Convolution2D(2622, (1, 1)))
model.add(Flatten())
model.add(Activation('softmax'))

Load the pre-trained weights for VGG-Face model. You can find the related blog post here.

#pre-trained weights of vgg-face model.
#you can find it here: https://github.com/serengil/deepface_models/releases/download/v1.0/vgg_face_weights.h5
#related blog post: https://sefiks.com/2018/08/06/deep-face-recognition-with-keras/
model.load_weights('vgg_face_weights.h5')

We should lock the layer weights for early layers because they could already detect some patterns. Fitting the network from scratch might cause to lose this important information. I prefer to freeze all layers except last 3 convolution layers (make exception for last 7 model.add units). Also, I cut the last convolution layer because it has 2622 units. I need just 101 (ages from 0 to 100) units for age prediction task. Then, add a custom convolution layer consisting of 101 units.

for layer in model.layers[:-7]:
layer.trainable = False

base_model_output = Sequential()
base_model_output = Convolution2D(101, (1, 1), name='predictions')(model.layers[-4].output)
base_model_output = Flatten()(base_model_output)
base_model_output = Activation('softmax')(base_model_output)

age_model = Model(inputs=model.input, outputs=base_model_output)

Training

This is a multi-class classification problem. Loss function must be categorical crossentropy. Optimization algorithm will be Adam to converge loss faster. I create a checkpoint to monitor model over iterations and avoid overfitting. The iteration which has the minimum validation loss value will include the optimum weights. That’s why, I’ll monitor validation loss and save the best one only.

To avoid overfitting, I feed random 256 instances for each epoch.

age_model.compile(loss='categorical_crossentropy', optimizer=keras.optimizers.Adam(), metrics=['accuracy'])

checkpointer = ModelCheckpoint(filepath='age_model.hdf5'
, monitor = "val_loss", verbose=1, save_best_only=True, mode = 'auto')

scores = []
epochs = 250; batch_size = 256

for i in range(epochs):
print("epoch ",i)

ix_train = np.random.choice(train_x.shape[0], size=batch_size)

score = age_model.fit(train_x[ix_train], train_y[ix_train]
, epochs=1, validation_data=(test_x, test_y), callbacks=[checkpointer])

scores.append(score)

It seems that validation loss reach the minimum. Increasing epochs will cause to overfitting.

age-prediction-loss-v2
Loss for age prediction task

Model evaluation on test set

We can evaluate the final model on the test set.

age_model.evaluate(test_x, test_y, verbose=1)

This gives both validation loss and accuracy respectively for 6673 test instances. It seems that we have the following results.

[2.871919590848929, 0.24298789490543357]





24% accuracy seems very low, right? Actually, it is not. Herein, researchers develop an age prediction approach and convert classification task to regression. They propose that you should multiply each softmax out with its label. Summing this multiplications will be the apparent age prediction.

age-prediction-approach
Age prediction approach

This is a very easy operation in Python numpy.

predictions = age_model.predict(test_x)

output_indexes = np.array([i for i in range(0, 101)])
apparent_predictions = np.sum(predictions * output_indexes, axis = 1)

Herein, mean absolute error metric might be more meaningful to evaluate the system.

mae = 0

for i in range(0 ,apparent_predictions.shape[0]):
prediction = int(apparent_predictions[i])
actual = np.argmax(test_y[i])

abs_error = abs(prediction - actual)
actual_mean = actual_mean + actual

mae = mae + abs_error

mae = mae / apparent_predictions.shape[0]

print("mae: ",mae)
print("instances: ",apparent_predictions.shape[0])

Our apparent age prediction model averagely predict ages ± 4.65 error. This is acceptable.

Testing model on custom images

We can feel the power of the model when we feed custom images into it.

from keras.preprocessing import image
from keras.preprocessing.image import ImageDataGenerator

def loadImage(filepath):
test_img = image.load_img(filepath, target_size=(224, 224))
test_img = image.img_to_array(test_img)
test_img = np.expand_dims(test_img, axis = 0)
test_img /= 255
return test_img

picture = "marlon-brando.jpg"
prediction = age_model.predict(loadImage(picture))

Prediction variable stores distribution for each age class. Monitoring it might be intersting.

y_pos = np.arange(101)
plt.bar(y_pos, prediction[0], align='center', alpha=0.3)
plt.ylabel('percentage')
plt.title('age')
plt.show()

This is the age prediction distribution of Marlon Brando in Godfather. The most dominant age class is 44 whereas weighted age is 48 which is the exact age of him in 1972.

age-prediction-distribution
Age prediction distribution for Marlon Brando in Godfather

We’ll calculate apparent age from these age distributions

img = image.load_img(picture)
plt.imshow(img)
plt.show()

print("most dominant age class (not apparent age): ",np.argmax(prediction))

apparent_age = np.round(np.sum(prediction * output_indexes, axis = 1))
print("apparent age: ", int(apparent_age[0]))

Results are very satisfactory even though it does not have a good perspective. Marlon Brando was 48 and Al Pacino was 32 in Godfather Part I.

age-prediction-for-godfather-v2
Apparent Age Prediction in Godfather

Compare to original study

As I mentioned before, we re-trained the base model because the original study is mainly based on Caffe and I need pre-trained weights for Keras. The original study was the winner of the ChaLearn Looking at People (LAP) challenge on Apparent age V1 (ICCV ’15).





You are expected to predict the age of someone and there are several predictions of his/her age instead of actual age. So, your predictions will be evaluated by the mean and standard deviation the the jury predictions.

Evaluation formula

If your prediction is equal to the mean of the predictions, then error becomes 0. Besides, if your prediction is not close to the mean of the predictions but the standard deviation of jury predictions are high, then the error closes to 0 as well. On the other hand, you will be fined if your prediction is not close to the mean of predictions and the standard deviation of the jury predictions is low as well.

from math import e
df['epsilon'] = e ** ( -1*( (df['prediction'] - df['mean_age']) ** 2 ) / (2*(df['std_age']**2)) )
df['epsilon'].mean()

The ε value of this model is 0.387378, and MAE is 7.887859 for 1079 instances. On the other hand, the ε value of the original study was 0.264975. They declared that human reference of ε was 0.34. So, the original study is still little bit more accurate than the model I created in this post. Besides, my model is close to the human level for age prediction.

You can find the evaluation test data set and its labels here

Face detection

Train set images are already cropped and just facial areas are mentioned. Testing  a custom image requires to detect faces. This will increase the accuracy dramatically. Besides, face alignment is not a must but it is a plus for this study.

There are several face detection solutions. OpenCV offers haar cascade and single shot multibox detector (SSD). Dlib offers Histogram of Oriented Gradients (HOG) and Max-Margin Object Detection (MMOD). Finally Multi-task Cascaded Convolutional Networks (MTCNN) is a common solution for face detection. Herein, haar cascade and HoG are legacy methods whereas SSD, MMOD and MTCNN are deep learning based modern solutions. You can see the face detection performance of those model in the following video.

Here, you can also see how to run those different face detectors in a single line of code with deepface framework for python.

You can find out the math behind face alignment more on the following video:

Face detectors extract faces in a rectangle area. So, it comes with a noise such as background color. Here, we can find 68 landmarks of a facial image with dlib

Here, retinaface is the cutting-edge face detection technology. It can even detect faces in the crowd and it finds facial landmarks including eye coordinates. That’s why, its alignment score is very high.





Gender prediction model

Apparent age prediction was a challenging problem. However, gender prediction is much more predictable.

We’ll apply binary encoding to target gender class.

target = df['gender'].values
target_classes = keras.utils.to_categorical(target, 2)

We then just need to put 2 classes in the output layer for man and woman.

for layer in model.layers[:-7]:
layer.trainable = False

base_model_output = Sequential()
base_model_output = Convolution2D(2, (1, 1), name='predictions')(model.layers[-4].output)
base_model_output = Flatten()(base_model_output)
base_model_output = Activation('softmax')(base_model_output)

gender_model = Model(inputs=model.input, outputs=base_model_output)

Now, the model is ready to fit.

scores = []
epochs = 250; batch_size = 256

for i in range(epochs):
print("epoch ",i)

ix_train = np.random.choice(train_x.shape[0], size=batch_size)

score = gender_model.fit(train_x[ix_train], train_y[ix_train]
, epochs=1, validation_data=(test_x, test_y), callbacks=[checkpointer])

scores.append(score)

It seems that the model is saturated. Terminating training will be clever.

loss-for-gender-v2
Loss for gender prediction

Evaluation

gender_model.evaluate(test_x, test_y, verbose=1)

The model has the following validation loss and accuracy. It is really satisfactory.

[0.07324957040103375, 0.9744245524655362]

Confusion matrix

This is a real classification problem instead of age prediction. The accuracy should not be the only metric we need to monitor. Precision and recall should also be checked.

from sklearn.metrics import classification_report, confusion_matrix

predictions = gender_model.predict(test_x)

pred_list = []; actual_list = []

for i in predictions:
pred_list.append(np.argmax(i))

for i in test_y:
actual_list.append(np.argmax(i))

confusion_matrix(actual_list, pred_list)

The model generates the following confusion matrix. Columns are prediction whereas rows are actual value labels.

| Female | Male |
Female | 1873 | 98 |
Male | 72 | 4604 |





This means that we have 96.29% precision, 95.05% recall. These metrics are as satisfactory as the accuracy.

Testing gender for custom images

We just need to feed images to the model.

picture = "katy-perry.jpg"
prediction = gender_model.predict(loadImage(picture))

img = image.load_img(picture)#, target_size=(224, 224))
plt.imshow(img)
plt.show()
gender = "Male" if np.argmax(prediction) == 1 else "Female"
print("gender: ", gender)

Conclusion

So, we’ve built an apparent age and gender predictors from scratch based on the research article of computer vision group of ETH Zurich. In particular, the way they proposed to calculate apparent age is an over-performing novel method. Deep learning really has a limitless power for learning.

I pushed the source code for both apparent age prediction and gender prediction to GitHub. Similarly, real time age and gender prediction implementation is pushed here. You might want to just use pre-trained weights. I put pre-trained weights for age and gender tasks to Google Drive.

Python library

Herein, deepface is a lightweight facial analysis framework covering both face recognition and demography such as age, gender, race and emotion. If you are not interested in building neural networks models from scratch, then you might adopt deepface. It is fully open-source and available on PyPI. You can make predictions with a few lines of code.

deepface-analysis
Deep Face Analysis

Here, you can watch how to apply facial attribute analysis in python with a just few lines of code.

You can run deepface in real time with your web cam as well.

Meanwhile, you can run face verification tasks directly in your browser with its custom ui built with ReactJS.

Also, deepface has its React JS ui for facial attribute analysis purposes.





Anti-Spoofing and Liveness Detection

What if DeepFace is given fake or spoofed images? This becomes a serious issue if it is used in a security system. To address this, DeepFace includes an anti-spoofing feature for face verification or liveness detection.


Support this blog if you do like!

Buy me a coffee      Buy me a coffee


83 Comments

  1. Great article!
    I tried the code and the results are really good.
    I have verified, however, that even by submitting an image that does not contain a face, the model returns a prediction of age and gender. Is there a way to detect images where there is no face?

    1. First of all, thank you for your feedback.

      Yes, you are right. Current implementation always returns a prediction. You can detect face first and apply age/gender prediction to detected faces. I have used OpenCV’s haarcascade module to detect faces. You can find a similar implementation here: https://sefiks.com/2018/01/10/real-time-facial-expression-recognition-on-streaming-data/ . In this case, I detect faces and apply emotion prediction instead of age/gender. I think you can easily adapt to this problem.

      1. Dear Sefik,
        thanks for your kind reply.
        I will try your example immediately.

  2. thanks Sefik!
    I’m doing tests and everything seems to work very well!
    I wanted to ask you if you have any suggestions to increase the recall in the face recognition process and in the age detection process.
    I am currently using the pre-trained Wikipedia-based model. Do you think that if I used IMDB (7GB) the recall of the age detection process would improve?

    1. I actually trained the model with both imdb and wiki data set. Currently, the model can predict with error plus minus 4 ages.

      You might re-train with an alternative model such as inception v3 or regular vgg to increase the accuracy.

      1. Hi Sefik, when i used the imdb dataset, the jupyter notebook and colab(with GPU) crashes whenever i loop through the observations to get the pixels. My guess is that its a memory issue. do you know any way i can get around it?

        1. Even though colab offers you GPU, it has a limited memory. I’ve run this study on my local environment and I do not have a memory problem. I recommend you to decrease the size of the data set.

      1. That post directs you a github repo. Could you try the iPython notebook in there?

  3. I followed the same steps, but got an accuracy of 0.5% [3.497109861968012, 0.057720696795011364], kindly help me

    1. This is for gender prediction? If so, 1st index value is greater than 2nd index. This means gender would be woman.

  4. This for Age prediction i followed the same github https://github.com/serengil/tensorflow-101/blob/master/python/apparent_age_prediction.ipynb.
    After the model is trained when model is evaluated on test set the github page says accuracy is
    6774/6774 [==============================] – 17s 2ms/step
    Out[102]:
    [2.871919590848929, 0.24298789490543357].

    For me it is 6774/6774 [==============================] – 28s 4ms/step
    [3.493283847809401, 0.056834957193652544].

    I really love your tutorial, please help me to achieve the same accuracy.

    1. Oke I understand what you mean right now. First of all, I trained the model with both wiki and imdb data set. The more data brings the more accurcy. Secondly, you cannot get same accuracy because of random initialization. Why you need to get same accuracy? You might use same pre-trained weight and by-pass traininig.

  5. Sorry i mean you got 24%, whereas i got only 0.5%. So you mean i should also train IMDB dataset along with Wiki dataset? for better accuracy?

    1. You should focus on loss value because we will not approach this problem as classification problem. The following steps will calculate weighted ages by multiplying age label and its probability.

  6. Firstly, thanks for this guide. I have a question, how could we predict first some age groups and then the age for improving the model?

    1. Suppose that you want to classify ages in 3 classes: young, middle age and old. You can add the following code block after the 14th block in https://github.com/serengil/tensorflow-101/blob/master/python/apparent_age_prediction.ipynb

      #add after 14th block
      df[df[df[‘age’] >= 50].index, ‘age_class’] = ‘old’
      df[df[(df[‘age’] < 50) & (df['age'] >= 30)].index, ‘age_class’] = ‘middle_age’
      df[df[df[‘age’] < 30].index, 'age_class'] = 'young' df['age'] = df['age_class'] df = df.drop(columns=['age_class']) But I do not know this approach can reach the accuracy level of the already implemented one.

  7. I am trying to implement this in resnet50. However, the model does not predict well on older people. How could I increase this?

    I tried to retrain multiple times with different datasets, But when I do this, the MAE and loss are not going to be improved it starts from different higher values then the first network was trained on. It looks that the learned weights are forgotten.

    1. 1- Do you freeze the early layers with trainable = False command?
      2- Ignore mae and loss because we finally calculate the weighted ages. I mean that we would not get the highest score age, each age score will be multiplied with its label.
      3- Resnet50 is designed for object recognition. But Face version of VGG is designed for face recognition. This means that model detects face oriented features in early layer. That’s why, VGG would overperform most probably.

      1. 1. I freeze all layers
        2. For evaluating, I checked on MAE on different datasets. For some, I achieve between 4,80 – 7.50 MAE. However older People around 70 old are years are predicted mostly around 50-60.
        3. Good to know, didnt knew it.

        Is it normal that age estimation works better at younger people than 60+ ages?

        1. Please freeze just early layers. I mostly freeze all layers except the last 4 convolution layer. In this way, pre-trained model detect some patterns in early layers and my fully connected layers can find relation between detected patterns and my custom problem (in this case age and gender)

          1. for layer in model.layers[:-7]:
            layer.trainable = False
            The above is some thing like freezing from 7 to prediction layers.

            Wondering is this right to freeze the layers. I think it should be as below to freeze the first few layers (from input to 7 layers)
            for layer in model.layers[:7]:
            layer.trainable = False

            is my understanding wrong?

          2. model.layers[:-7]: means freeze all layers except last 7. I mean 7 on the right. Early layers freezed.

  8. Thanks, the tutorial is pretty nice but I am having memory issues. I tried to train the model on Google colab too without using the weights but just after 5 iterations of batch size 128 and that too from first 1000 images the GPU memory runs out. I wanted to ask if you implemented some other code for memory runout???

      1. Yes, I have kept it to 5 epochs and batch_size to 128 but the memory seems to overflow

          1. I have a doubt the load_model and save_weights line in the github repo for age prediction is outside the loop while for gender prediction it is inside the loop.
            Keeping the lines outside the loop will too restore best weights or it should be inside only?? Because, I think it is creating the memory problem…

  9. Thanks, this seems to solve the memory issue but did you train for all wiki images at once?

    1. Nope, I fed as batches. ix_train stores imdb images length of batch size. Every iteration it is stored randomly in all imdb data set.

      1. ok while storing in np array above the training the memory runs out…
        During this line:
        df[‘pixels’] = df[‘full_path’].apply(getImagePixels)
        Here, I am feeding only 2000 images of the dataset which seems to be hindering the accuracy and the loss because of insufficient amount of images.
        Did you load the complete dataset?

  10. Dataframe stores the path right? And then we load it using np amd store pixel values but these values and seems to cause memory error if I use all of the images… Hence, I am using only the first 2000 images and am getting 5% accuracy…

    1. Raw data frame stores path but then we load it as pixels. You might reload pixels in the for loop to reduce memory allocation. Because data frame size is low without pixels.

      1. just changing enablefit to true is enough?
        what about agemodel (hdf5) ? should i download it or it will be created on process(if it doesn’t exist already)?

        1. You must download and put hdf5 file in the same directory, then set enableFit param to true. Reference link of weight file exists in the post.

          1. hdf5 is a checkpoint model file. right?
            I have already downloaded weights (h5 files)
            I am hoping it will create a checkpoint model on the process, am I wrong?

          2. Correction: it should be h5, not hdf5. If you installed weights file, that’s enough.

  11. If you don’t have a system with good GPU, then try to implement it on google colab which has pretty good processing power on GPU

  12. Great article. I have one question though. On every epoch, after calculating the weights for current epoch, model validates loss on all validation_data(xtest, ytest) ?
    If yes, Is there any way, to perform same in batches like we pass the xtrain in batches?

    I seem to have issues of memory overflow for all xtest (50000+) images loaded in numpy array.

    1. Validation performs on all test data. Instead of applying batches, you should select a smaller size sub set of test data and validate it. In this way, you can compare your loss with same conditions.

      1. Okay. But still, is there any way to perform validation step separately in a loop?

  13. Hi Sefik,
    thank you for your great article!
    I have a problem while getting the pixels values from the full_path.
    It seems that PIL can’t find the path of each image inside the full_path column.

    df[‘pixels’] = df[‘full_path’].apply(getImagePixels)

    FileNotFoundError: [Errno 2] No such file or directory: ‘wiki_crop/17/10000217_1981-05-05_2009.jpg’
    Best Regards

      1. Thank you for your amazing article
        I have exactly the same problem, I am sure that wiki_crop folder in the same directory of my notebook? Could you please help me to solve the problem?
        Thanks.

  14. Hi Sefik,

    First of all, thanks for publishing this tutorial and the pre-trained weights.
    I have a problem when trying to implement the face model. It gives an error for the conv2d layer (7,7), it says ‘negative dimension caused by subtracting 7 from 3…’.
    Did you occur a similar problem, or have a suggestion how to avoid this error?

    1. What is your tensorflow and keras version? I tested it on TensorFlow 1.9.0 and Keras 2.2.0.

        1. Could you try it same version of my environment? I could not solve this issue for that environment soon.

  15. Hi, thank you for being in your block and a dimension of the information you have provided to us. I’m constantly taking advantage of your blocks in my classes. I would really appreciate it if you help me I’m constantly getting error in your printer’s real-time prediction, and the other code may be the librarian who can ever walk away but what I’m doing is missing something.

  16. Hello Sefik,
    First of all thank you for this great presentation and word, it has truly helped me working on a similar project!
    I used imdb + wiki, cropped faces using Haarcascade (resulting in a 200k faces approximately), trained my model using transfer learning as well, my score is pretty similar, but i can’t explain the weird difference:
    [2.86, 24%], but an MAE of 6.8, using your multiplication technique.
    How would you explain this?

    1. The original idea is not mine. I applied some approach in the original paper as I mentioned. Herein, age prediction is not a classification problem. If you evaluate your predictor as classifier, then you count inaccurate 1 or 2 year deviations. On the other hand, this is very satisfactory prediction even for people. Researchers find a robust way to transform a classification problem to a regression problem.

      They might build a neural networks with a single output node. In this way, they can build a regressor but I think this approach shows more accurate results.

      1. Ok yes apparently it does.
        And what do you think about the fact that the data is highly unbalanced? I feel like the model tends to predict a lot between 30 and 50, and has trouble predicting young people age.
        What would you suggest to fight that?

        1. The only solution is to increase the sample data. You are absolutely right. Because I tested the model on kids and kids less than 7 years old seems 20 years old based on the model prediction 🙂

  17. Thank you and it is a great article to follow.

    I am working on the age estimation. One thing I noticed is that the data in your training loss vs validation loss plot is different from the final result you showed. You achieved around 3.2-3.6 validation error but in your result you are showing the validation error is around 2.8 and achieved 24% accuracy for age estimation.

    You mentioned that you actually used both wiki and imdb dataset to achieve this. But I used both datasets as well and final achieved around 3.2 for validation error and 12% accuracy for age estimation and MAE around 6.

    I was wondering could you explain exactly what you changed here for the code to achieve higher accuracy and lower MAE. Do not need to be the same but at least I would like to be closer to your results. Thank you!

    1. I just combine wiki and imdb data sets and applied batch learning. Even I cannot get same score if I re-train the model from scratch. Luckily, I stored the pre-trained weights to re-use.

      1. Thanks for your reply Sefik. It’s normal that there will be a bit variation since each epoch is different and assigned images randomly. But there must be a range of variation, if anything between 20%-28% for accuracy, I guess it could be caused by variation. But I can only achieve 12% for accuracy, what would be your ‘normal’ accuracy rate and any suggestion how to improve? Would the change of optimizer to sgd instead of adam work or maybe another pre-train model such as inception V3 instead of vgg?

        1. Nope. The model was VGG-Face and optimizer was Adam but you might try to use SGD. Of course you should spend more time in this case.

  18. where is classification _age _model.hdf5 and classification _age _model.hdf5
    please tell me because without it this code is not run

  19. Thanks for this great work, i highly appreciate. When i use the imdb dataset, the full path is in this format [01/nm0000001_rm124825600_1899-5-10_1968.jpg], im unable to get image pixels from this file format. thanks in advance

    1. The program you are running in imdb_crop folder? I mean that the both 01 folder and your program have to be in same folder.

  20. Hi, Sefik!

    Thanks a lot for your post. He very clearly and consistently explains the solution to the problem of determining age from a photo. Perhaps this is the best thing that can be found on the net on the topic of age determination.

    I would like to expand the range of age definitions towards 0-15 years. I have a dataset with pictures with these ages. How can I train your model on this dataset?

    1. Thank you firstly. You should apply the procedures mentioned in this post. In other words, you need to retrain it from scratch.

      1. Thanks. How long did it take you to train the neural network? And what equipment was used, if not a secret?

  21. Hi, Sefik!
    Another question arose. In the section of the script with the preparation of input data for training the network, there is a commented out line
    #img = preprocess_input (img).
    Shouldn’t you uncomment this line? You use the VGG-Face model as a basis. She trained images in which pixels are normalized in the range [-1, +1] (https://sefiks.com/2018/08/06/deep-face-recognition-with-keras/). Why are pixels in this article normalized from 0 to 1?

    1. The both are true. In my experiments, normalizing in the range [0, 1] returns more robust results but if you normalize it in the range [-1, +1] then it will still work.

  22. I went through all the videos ,tutorials related to your Deepface project .You have explained all the things step by step very clearly and this is by far the best and comprehensive guide in the web for face delection/recognition i could find You have done an amazing work for all the students and to the open source community friend, . Really amazing .Keep up the good work

  23. I get this error when I’m trying to run…

    AttributeError: module ‘tensorflow.compat.v2.__internal__’ has no attribute ‘register_clear_session_function’

  24. Hi, Thanks for the amazing tutorial..

    The link to the pre-trained weight is broken, could you pin point where can I download it?

    Best

Comments are closed.