With Transfer Learning, it is possible to take a pre-trained network and use it as a starting point for a new task
It’s no secret to anyone familiar with image classification that CNNs need a considerable number of images, plenty of parameter tuning, and a lot of processing time to output a reasonable result.
The good news is: with Transfer Learning, it is possible to take a pre-trained network (trained on a set of images, for instance) and use it as a starting point for training on a new task. For example, a network trained to recognize motorcycles can be retrained to identify bicycles. Note that the problems are different, yet related to each other.
Convolutional layers typically learn to recognize edges, intensities, textures, and, at deeper levels, shapes. We can take that kind of knowledge the network already has and generalize it to another task.
The most common way of doing Transfer Learning is to use models already established in the literature, such as VGG, Inception, and MobileNet. These networks are trained on large datasets such as ImageNet or CIFAR, which contain thousands (or even millions) of images spread over many classes.
By training on these datasets, the network learns features common to a large number of different objects. So, what happens if we add some dense layers at the end of the already-trained network and train only those to learn about new objects? Guess what: we are transferring the network’s knowledge to a (partially) new one!
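Before diving into the full walkthrough, here is a minimal sketch of the idea (the names and the two-class setup are just illustrative; the tutorial below uses a feature-extraction variant of the same approach): freeze the convolutional layers of a pre-trained network and train only a fresh classifier on top.

from keras.applications.vgg16 import VGG16
from keras import layers, models

# Load the convolutional base of VGG16 without its original classifier
base = VGG16(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
for layer in base.layers:
    layer.trainable = False  # keep the learned edge/texture/shape filters intact

# Stack a brand-new classifier on top (2 classes here, e.g. bicycles vs. motorcycles)
model = models.Sequential()
model.add(base)
model.add(layers.Flatten())
model.add(layers.Dense(2, activation='softmax'))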
Let’s see how to do that in practice. Code time.
Requirements
- Python
- Pillow
- NumPy
- TensorFlow
- Keras
- Matplotlib
Code
To do our transfer learning, we first need to choose an already-trained network. VGG16 is a good choice here: it demonstrated state-of-the-art performance in object classification, achieving one of the top results in the classification task of ILSVRC 2014 (ImageNet Large Scale Visual Recognition Challenge). Figure 1 shows the VGG16 architecture.
Keras (with the TensorFlow backend) has a great set of tools for using VGG16 and other models, with the option of loading pre-trained weights. The code for loading the model is:
# include_top: whether to include the fully-connected layers at the top of the network.
# For transfer learning, only the new dense layer will learn about the new classes.
from keras.applications.vgg16 import VGG16

vgg_conv = VGG16(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
Note that the dense layers are not included; our job is to create and train a new one. That new dense layer will be trained/tested/validated on the features produced by a single pass of each image through all the VGG16 convolutional layers.
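As a quick sanity check (a small sketch, using the vgg_conv model loaded above), we can confirm the shape of the features the convolutional base produces; this is where the 7 × 7 × 512 dimensions used later come from:

# With include_top=False and a 224x224x3 input, VGG16's convolutional
# base outputs a 7x7x512 feature map per image.
print(vgg_conv.output_shape)  # expected: (None, 7, 7, 512)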
A few helper functions have to be implemented. Keras also provides an easy way to load images and generate batches of tensor image data. The function below runs every batch through the network and collects the resulting features and labels; it is used for both the train and test datasets.
import numpy as np

def data_generator(network, generator, number_of_images, batch_size, shape_features, shape_labels):
    '''
    inputs:
        network: the network used to process the images
        generator: the image data generator of the given inputs (train or test)
        number_of_images: number of images present in the inputs
        batch_size: size of the batch
        shape_features: format of the features
        shape_labels: format of the labels
    return:
        features and labels of the given set after a single pass of all
        input images through the network
    '''
    features = np.zeros(shape=shape_features)
    labels = np.zeros(shape=shape_labels)
    i = 0
    # Due to the network format, every image goes through it to obtain
    # features with the expected dimensions
    for inputs_batch, labels_batch in generator:
        features_batch = network.predict(inputs_batch)
        features[i * batch_size : (i + 1) * batch_size] = features_batch
        labels[i * batch_size : (i + 1) * batch_size] = labels_batch
        i += 1
        if i * batch_size >= number_of_images:
            break
    return features, labels
The code snippet below shows the functions that count folders (classes) and images for the current dataset.
import os
from keras.preprocessing.image import ImageDataGenerator

# General-use function to count the number of images and folders (classes) inside a directory
def count_number_of_folders_and_files(directory):
    '''
    inputs:
        directory: the directory where the recursive walk will take place
    return:
        totalFiles, totalFolders: the number of files and folders
    '''
    totalFolders = 0
    totalFiles = 0
    for root, dirs, files in os.walk(directory):
        totalFiles += len(files)
        totalFolders += len(dirs)
    return totalFiles, totalFolders

# Load the image data
def imagedata_generator(folder, size, batch_size):
    '''
    inputs:
        folder: the directory where the images are located
        size: the target size of the images
        batch_size: the size of the batch
    return:
        an iterator over all the images of a given directory
    '''
    imagedata_gen = ImageDataGenerator(rescale=1./255)
    imagedata_gen = imagedata_gen.flow_from_directory(folder,
                                                      target_size=size,
                                                      batch_size=batch_size,
                                                      class_mode='categorical',
                                                      shuffle=True)
    return imagedata_gen

print("Counting folders and files...")
number_of_training_images, number_of_classes = count_number_of_folders_and_files(train_dir)
number_of_testing_images, number_of_classes = count_number_of_folders_and_files(test_dir)
print("Done!")
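Note that train_dir, test_dir, batch_size, and training_epochs are assumed to be defined beforehand, and flow_from_directory expects one subfolder per class. A hypothetical setup (the paths and values below are just examples):

# Hypothetical dataset layout expected by flow_from_directory:
#
#   data/train/bicycles/*.jpg
#   data/train/motorcycles/*.jpg
#   data/test/bicycles/*.jpg
#   data/test/motorcycles/*.jpg
train_dir = 'data/train'
test_dir = 'data/test'
batch_size = 32       # example value
training_epochs = 20  # example value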
After that, we generate the image data according to the VGG16 architecture.
print("Train data:")
train_generator = imagedata_generator(folder=train_dir, size=(224, 224), batch_size=batch_size)
print("Test data:")
test_generator = imagedata_generator(folder=test_dir, size=(224, 224), batch_size=batch_size)

print("Generating train features and labels")
train_features, train_labels = data_generator(vgg_conv, train_generator,
                                              number_of_training_images, batch_size,
                                              shape_features=(number_of_training_images, 7, 7, 512),
                                              shape_labels=(number_of_training_images, number_of_classes))
print("Generating test features and labels")
test_features, test_labels = data_generator(vgg_conv, test_generator,
                                            number_of_testing_images, batch_size,
                                            shape_features=(number_of_testing_images, 7, 7, 512),
                                            shape_labels=(number_of_testing_images, number_of_classes))
print("Done!")

# Flatten the 7x7x512 feature maps so they can feed the dense layers
train_features = np.reshape(train_features, (number_of_training_images, 7 * 7 * 512))
test_features = np.reshape(test_features, (number_of_testing_images, 7 * 7 * 512))
All set: time to train the dense layer.
from keras import models
from keras import layers
from keras import optimizers

model = models.Sequential()
model.add(layers.Dense(256, activation='relu', input_dim=7 * 7 * 512))
model.add(layers.Dropout(0.5))
model.add(layers.Dense(number_of_classes, activation='softmax'))

# Before training a model, you need to configure the learning process,
# which is done via the compile method.
model.compile(optimizer=optimizers.RMSprop(lr=2e-4),
              loss='categorical_crossentropy',
              metrics=['accuracy'])

history = model.fit(train_features,
                    train_labels,
                    epochs=training_epochs,
                    batch_size=batch_size,
                    verbose=2,  # show the progress once per epoch
                    validation_data=(test_features, test_labels))
The fit() function does everything. After it finishes, the network is ready to be used on the new task.
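Since Matplotlib is in the requirements list, here is a small sketch to visualize the learning curves from the history object returned by fit() above (depending on the Keras version, the metric keys may be 'accuracy'/'val_accuracy' instead of 'acc'/'val_acc'):

import matplotlib.pyplot as plt

# Plot training vs. validation accuracy per epoch from the History object
plt.plot(history.history['acc'], label='train accuracy')
plt.plot(history.history['val_acc'], label='validation accuracy')
plt.xlabel('epoch')
plt.ylabel('accuracy')
plt.legend()
plt.show()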
Please note that the code here is organized to give a detailed explanation on the blog and a better output in a Jupyter Notebook. For the working code as it should be, please take a look at my GitHub page. Have fun!
Hi Jean,
Thanks for the great articles!
Could you confirm how to properly save a model? Does it need freezing?
I have followed 3 different transfer learning models and I am able to finish training and save .pb files but these files never load correctly. I’ve heard mentions about freezing the model and weights but I don’t see many demonstrations of this.
Can you provide any guidance? 🙂
Thank you!
Jordan
Hi Jordan!
If you just want to save and load a model using Keras, it’s not complicated:
# Save the model
model.save('path_to_my_model.h5')
# Recreate the exact same model purely from the file
new_model = keras.models.load_model('path_to_my_model.h5')
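If you want to sanity-check the reload (a small sketch; the input shape below assumes the dense model from the post, so adjust it to your own model):

import numpy as np

# The original and reloaded models should give identical predictions
dummy = np.random.random((1, 7 * 7 * 512))  # input shape of the dense model above
assert np.allclose(model.predict(dummy), new_model.predict(dummy))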
Hi Jean,
Thanks for the reply!
I’m actually trying to load my TF model into OpenCV and I’ve been following your guide: https://jeanvitor.com/tensorflow-object-detecion-opencv/.
Do you know if I can simply load a frozen .pb file without .pbtxt?
I have a frozen .pb file after following a TF tutorial on Transfer Learning (https://www.tensorflow.org/hub/tutorials/image_retraining).
Is the .pbtxt strictly required for loading the frozen .pb file into OpenCV?
I am able to load a .pb file and create a network, but when I try a forward pass I get this error:
error: OpenCV(4.1.0) ..\modules\dnn\src\dnn.cpp:524: error: (-2:Unspecified error) Can't create layer "input/BottleneckInputPlaceholder" of type "PlaceholderWithDefault" in function 'cv::dnn::dnn4_v20190122::LayerData::getLayerInstance'
When I try to create the network by loading the .pb file together with a .pbtxt file created from the available .config files (as mentioned in your guide, https://jeanvitor.com/tensorflow-object-detecion-opencv/), I encounter errors that prevent me from creating my model correctly. I imagine part of these issues is because I’ve created a custom model.
Thanks!
Hi Jordan,
From personal experience, OpenCV’s DNN module is really problematic. It usually has problems with some layer types (like the Batch Normalization ones).
So, as you said, it’s very likely that this problem comes from some layer that OpenCV messes up during use.
.pb and .pbtxt are just different formats of the same thing: one is binary, the other is text.
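If you ever need the text form for inspection (a sketch, assuming TensorFlow 1.x and a hypothetical file name frozen.pb), the binary graph can be dumped as .pbtxt:

import tensorflow as tf

# Read the binary frozen graph and write it back out as text (.pbtxt)
with tf.gfile.GFile('frozen.pb', 'rb') as f:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(f.read())
tf.train.write_graph(graph_def, '.', 'frozen.pbtxt', as_text=True)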
Good luck!