Machine learning researchers want to share their outcomes. They may spend a lot of time constructing a neural network architecture and training the model; training alone can take days or weeks. They may also run the learning process on expensive hardware such as GPUs and parallelized systems. However, anyone can run the same model in seconds if they have the pre-constructed network structure and the pre-trained weights. In this way, learning outcomes can be transferred between different parties. Furthermore, you don't need a large-scale training dataset once the learning outcomes are transferred.
Vlog
The following vlog covers the transfer learning topic as well.
🙋♂️ You may consider enrolling in my top-rated machine learning course on Udemy.
Matrix
BTW, the "I know kung fu" scene in The Matrix is a metaphor for transfer learning.
Image Recognition
Formerly, developing an algorithm that could win at chess was thought to be the most demanding challenge in AI research. Since then, we've realized that classifying images is an even harder challenge than playing chess.
Previously, traditional computer vision algorithms were applied to recognize images. For example, the following illustration shows the ImageNet results over time. ImageNet consists of 1.2M images in 1000 different categories.
We were stuck at an error rate of almost 30% with traditional computer vision. Applying deep learning to this field changed the course of history: the error rate dropped to 15% in an instant. The orange node appearing in 2012 represents AlexNet.
ImageNet Winner Models
AlexNet changed the course of history, but today we've gone much further. The Inception V3 model produces an error rate of almost 3%. These common ImageNet models are supported by Keras, so we can transfer their learning outcomes with a few lines of code.
Inception V3
Inception V3 is a type of convolutional neural network. It consists of many convolution and max pooling layers and, finally, fully connected layers. However, you do not have to know its structure by heart; Keras handles it for us.
We would import Inception V3 as illustrated below.
from keras.applications.inception_v3 import InceptionV3
from keras.applications.inception_v3 import preprocess_input
from keras.applications.inception_v3 import decode_predictions
Also, we’ll need the following libraries to implement some preprocessing steps.
from keras.preprocessing import image
import numpy as np
import matplotlib.pyplot as plt
Constructing Inception
It is easy to construct the Inception V3 model. The weights are downloaded automatically the first time you run the model construction command. Setting the weights parameter to imagenet uses the weights pre-trained for the ImageNet challenge, whereas setting it to None initializes the weights randomly.
model = InceptionV3(weights='imagenet', include_top=True)
We can inspect the pre-constructed structure and pre-trained weights once the model is loaded.
print("model structure: ", model.summary()) print("model weights: ", model.get_weights())
Now we have the pre-constructed network structure and pre-trained weights of an ImageNet winner model, and we can ask Inception V3 about anything. I prefer to put dozens of images in a folder and name them with increasing indexes. BTW, I found these images randomly on Google.
We will ask for the 3 most probable candidates for each image, then display the image and its predictions together.
for i in range(1, 17):
    img_path = 'testset/%s.jpg' % (i)
    img = image.load_img(img_path, target_size=(299, 299))
    x = image.img_to_array(img)
    x = np.expand_dims(x, axis=0)
    x = preprocess_input(x)
    features = model.predict(x)
    print(decode_predictions(features, top=3))
    plt.imshow(image.load_img(img_path))
    plt.show()
It seems that the Inception V3 results are satisfying. Based on my observations, Inception V3 is good at recognizing animal species but may fail at recognizing pedigreed breeds. For example, when I ask the model to predict a British Shorthair, it predicts a Persian cat.
So, we've transferred the learning outcomes of the ImageNet winner model Inception V3 to recognize cat and dog images. Even though the model was trained on 1.2M images in 1000 different categories, we can consume it in seconds and produce the same results. As Prof. Andrew Ng mentioned, transfer learning will be the next driver of ML success.
Customization
Even though kung fu knowledge was transferred to Neo in The Matrix, he still connected to the Matrix and trained with Morpheus. We can do the same in transfer learning.
We know that advanced deep neural network models such as VGG or Inception have already learnt facial attributes in their mid- and high-level features.
However, these models were trained on millions of samples by hundreds of researchers with strong computational power (CPUs, GPUs and TPUs). Most of us have neither that much data nor that much computational power.
Still, we can take these pre-trained models and train them with a small dataset. The trick here is to freeze, or lock, the early layers and let only the final layers be trained, as sketched below. In this way, our new model keeps the mid- and high-level feature knowledge of the original.
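As an illustration, here is a minimal Keras sketch of this freezing approach. It assumes a hypothetical two-class problem (e.g. cat vs dog) and that the small training set is loaded elsewhere, so names such as num_classes, x_train and y_train are placeholders.
from keras.applications.inception_v3 import InceptionV3
from keras.layers import GlobalAveragePooling2D, Dense
from keras.models import Model

# load Inception V3 without its final classification layer
base_model = InceptionV3(weights='imagenet', include_top=False)

# freeze the pre-trained early layers so their weights stay locked
for layer in base_model.layers:
    layer.trainable = False

# attach a small trainable head for our own classes (placeholder class count)
num_classes = 2
x = GlobalAveragePooling2D()(base_model.output)
x = Dense(128, activation='relu')(x)
predictions = Dense(num_classes, activation='softmax')(x)

custom_model = Model(inputs=base_model.input, outputs=predictions)
custom_model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

# custom_model.fit(x_train, y_train, epochs=5)  # only the new head is updated on the small dataset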
Here, you can find a deep study about customization in transfer learning. It covers modifying the regular VGG model into an age and gender prediction model with little effort. The results are very satisfactory, as you can see in the following video.
Training from scratch
On the other hand, if we had millions of samples and trained the model from scratch, i.e. without freezing the early layer weights, the model would become more successful. The approach we've applied is practical, but it is not the best possible; a minimal sketch of the from-scratch alternative follows.
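For comparison, here is a minimal sketch of that alternative, assuming you really do have the full dataset and the computational budget. Setting weights to None initializes the network randomly and leaves every layer trainable.
from keras.applications.inception_v3 import InceptionV3

# weights=None gives random initialization; no layers are frozen
scratch_model = InceptionV3(weights=None, classes=1000)
scratch_model.compile(optimizer='sgd', loss='categorical_crossentropy', metrics=['accuracy'])

# scratch_model.fit(...)  # requires a very large labeled dataset and days of GPU/TPU time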
Conclusion
So, we've covered the different approaches to transfer learning and their pros and cons. Transfer learning basically proposes that you not be a hero: pre-trained models offer you the fastest solutions.
The code of the project is shared on GitHub. You can run your own tests with different images on different models.
Support this blog if you like it!