
Pitfalls during Training and Object Detection with TensorFlow for Absolute Beginners

This article is based on the great tutorial here on how to train and detect custom objects with TensorFlow. I also referred to the official documentation here and here for building TensorFlow models locally. This was my first custom detection project and I hit some hiccups along the way, so this article logs and shares my findings in the hope that it helps other beginners like me. In the end, I managed to train a TensorFlow model to detect Batsumaru, a character from Sanrio.

This is what the detection looks like.


The tools

  1. Windows 10 Pro 64
  2. TensorFlow, originally 1.7.1, later upgraded to 1.12.0 (I will explain why below).
  3. Python 3.5.4
  4. LabelImg for image labeling
  5. PyCharm IDE

Steps and Pitfalls

These are some of the mistakes I made and things I discovered while following the guide. I will not repeat the steps from the original guide, only the parts where I had to deviate from the walkthrough and figure things out by myself.
  1. The training and testing images have to be RGB. This is because TensorFlow expects a certain number of channels in the image, and this particular model expects 3 channels (RGB). This discussion helped point me in the right direction. (I should have read the guide more carefully!)
  2. Preparing the tools at tensorflow/models/research. This step was particularly rough for me since I was totally new to TensorFlow. I didn't understand at first that we need a separate set of scripts, located here, to generate a custom model. If we just want to run detection on an already-generated model, the core library here is enough. Since we will be generating our own model, we need both.
  3. When generating the TFRecord files, I encountered the error 'cannot import name string_int_label_map_pb2'. This means the proto files in tensorflow/models/research/object_detection/protos have not been compiled yet.
  4. When trying to compile the proto files, I found that the Windows version of protoc does not support wildcard characters, so we cannot compile all the proto files in a folder at once and instead have to compile them one by one. The workaround is to loop over them in the Windows command prompt (run from tensorflow/models/research), with something like: for /f %G in ('dir /b object_detection\protos\*.proto') do D:\Libraries\vcpkg\installed\x64-windows\tools\protoc object_detection\protos\%G --python_out=.
    1. Subnote: when compiling with protoc, if you get 'no input' errors, you probably need an additional space, as mentioned here.
  5. When running object_detection/model_main.py, I encountered the error 'from pycocotools import coco \ ModuleNotFoundError: No module named 'pycocotools''. I needed to install cocoapi, but it is not officially supported on Windows! Luckily I found a Windows port here. You can also follow the discussion on the Protoc repository.
  6. When running object_detection/model_main.py, I encountered an error about non_max_suppression receiving the unexpected keyword 'score_threshold'. This is caused by TensorFlow versions below 1.9 not supporting 'score_threshold', and it is why I upgraded my TensorFlow to v1.12.0. After upgrading TensorFlow we must also use the corresponding CUDA and cuDNN libraries; I found that CUDA 10 and cuDNN 7.4 work for me. Also, don't forget to update CUDA_HOME in the environment settings.
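Besides the compiled protos, the training pipeline also needs a label_map.pbtxt that maps each class ID to a name. For a single custom class it is just a short text file; here is a minimal sketch of generating one (the file path is my example, and note that IDs start at 1 because 0 is reserved for the background class):

```python
# Write a minimal label_map.pbtxt for a single custom class.
# The TF Object Detection API expects class IDs to start at 1.
label_map = """item {
  id: 1
  name: 'batsumaru'
}
"""

with open('label_map.pbtxt', 'w') as f:
    f.write(label_map)
```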
The actual command I used to build the model:

py .\object_detection\model_main.py --pipeline_config_path=D:\Workspace\TF_TrainDetect\training\models\model\ssd_mobilenet_v1_pets.config --model_dir=D:\Workspace\TF_TrainDetect\training\output --num_train_steps=50000 --sample_1_of_n_eval_examples=1 --alsologtostderr

Some additional notes when running the above command:
  1. Make sure the number of training steps (--num_train_steps) is appropriate for your needs. My 50,000-step training run took almost 18 hours to complete.
  2. model_dir is the folder where the generated model checkpoints will be saved.
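For reference, the pipeline config file named in the command above also has to be edited before training. This is a rough sketch of the kinds of fields involved, with placeholder paths rather than my exact values:

```
model {
  ssd {
    num_classes: 1  # one custom class
  }
}
train_config {
  fine_tune_checkpoint: "PATH_TO_PRETRAINED_MODEL/model.ckpt"
  num_steps: 50000
}
train_input_reader {
  tf_record_input_reader { input_path: "PATH_TO/train.record" }
  label_map_path: "PATH_TO/label_map.pbtxt"
}
eval_input_reader {
  tf_record_input_reader { input_path: "PATH_TO/test.record" }
  label_map_path: "PATH_TO/label_map.pbtxt"
}
```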

The actual command I used to export the model:
py .\object_detection\export_inference_graph.py --input_type=image_tensor --pipeline_config_path=D:\Workspace\TF_TrainDetect\training\models\model\ssd_mobilenet_v1_pets.config --trained_checkpoint_prefix=D:\Workspace\TF_TrainDetect\training\output\model.ckpt-50000 --output_directory=D:\Workspace\TF_TrainDetect\training\exported_model

Some additional notes when running the above command:
  1. For trained_checkpoint_prefix, I chose the final checkpoint at 50,000 steps, but this can be changed to any checkpoint that is deemed good enough for detection.
  2. output_directory is where the frozen_inference_graph.pb file will be generated. This frozen graph is the final model used in the detection phase.
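TensorFlow writes each checkpoint as a group of model.ckpt-&lt;step&gt; files in model_dir, and trained_checkpoint_prefix is just that path without the file extension. A small helper for picking the latest checkpoint automatically could look like this (my own sketch, not part of the API):

```python
import os
import re

def latest_checkpoint_prefix(model_dir):
    """Return the model.ckpt-<step> prefix with the highest step number,
    or None if no checkpoints are found in model_dir."""
    steps = []
    for name in os.listdir(model_dir):
        # Each checkpoint is a group of files: .index, .meta, .data-*;
        # counting the .index files gives one entry per checkpoint.
        m = re.match(r'model\.ckpt-(\d+)\.index$', name)
        if m:
            steps.append(int(m.group(1)))
    if not steps:
        return None
    return os.path.join(model_dir, 'model.ckpt-%d' % max(steps))
```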

Detection Phase

For the detection test, I searched for videos containing the character and used those as input. The detection results were inaccurate when other characters were present and when the character appeared too small in the frame. This is due to the low number of images used in training, and also because most of those images are too similar to each other. Improving the training images would definitely boost detection accuracy.
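The detection script only draws boxes above a confidence threshold, which is also how low-confidence false positives on other characters get suppressed. A minimal sketch of that filtering step (the function and parameter names are mine):

```python
def filter_detections(boxes, scores, classes, min_score=0.5):
    """Keep only detections whose confidence score meets the threshold.

    boxes, scores and classes are parallel lists, as returned by the
    TF Object Detection API (each box is [ymin, xmin, ymax, xmax]).
    """
    kept = [(box, score, cls)
            for box, score, cls in zip(boxes, scores, classes)
            if score >= min_score]
    if not kept:
        return [], [], []
    kept_boxes, kept_scores, kept_classes = zip(*kept)
    return list(kept_boxes), list(kept_scores), list(kept_classes)
```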

The detection test code can be found here.
The /training folder is not included due to file size restrictions on GitHub; if any steps are unclear because of this, please let me know.

All the best!


During one of the projects in my company I needed to build a native plugin to let Unity communicate with the Asus Xtion2, specifically to get its depth data. We used to be able to do this pretty easily with the Kinect but since Microsoft discontinued it, we need to start looking for alternatives. Test Environment Windows 10 64 bit. Unity 2017.2.0f3 x64. Important! Choose x64 or x86 to match your Unity installation. OpenNI For Xtion2 SDK. The official SDK is somewhat different from the OpenNI SDK provided by Asus but it should behave the same. The one provided by Asus can be downloaded here . Make sure you choose the latest one. CMake 3.0 or higher Who is it for Someone who has been using Unity for some time and is comfortable with the concept of classes and objects. It will be very helpful if you know C++ and pointers too. Steps Since the code is provided, I will only go over the major steps. Let me know in the comments if I miss anything. Build and install OpenNI...