This is just a memo of the challenges I faced when running model training on Google Colab while following a great tutorial here.
Mind the versions
TensorFlow is currently at version 2.2.0, but most tutorials still use the contrib package, and there is no easy way to update the code to remove the dependency on contrib. So my best bet was to downgrade TensorFlow to 1.x. Since Google Colab only offers a choice between 1.x and 2.x and we cannot specify an exact version, I ended up with version 1.15.2.
Even with the command: %tensorflow_version 1.15.0
I ended up with: 1.15.2
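For reference, this is the cell I ran. The magic only honors the major version; the exact point release (1.15.2 in my case) is whatever Colab ships:

```
%tensorflow_version 1.x

import tensorflow as tf
print(tf.__version__)
```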
Another pitfall was the NumPy version. Installing numpy gives version 1.18.3, but for some reason this generates the error:
TypeError: 'numpy.float64' object cannot be interpreted as an integer
Downgrading NumPy to version 1.17.4 solved this for me.
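My assumption is that NumPy 1.18 stopped accepting floats where an integer count is required, and some part of the training pipeline still passes one. A minimal reproduction of the same error message, with an explicit cast as the alternative fix:

```python
import numpy as np

num = np.float64(10.0)

try:
    # Raises: 'numpy.float64' object cannot be interpreted as an integer
    list(range(num))
except TypeError:
    pass

# Besides downgrading NumPy, casting to int at the call site avoids the error:
steps = list(range(int(num)))
```

Downgrading was the pragmatic choice here, since the offending call is buried inside library code rather than my own.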
It seems we don't need ngrok for TensorBoard
With the command: %load_ext tensorboard
we can view TensorBoard directly in Google Colab.
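For the record, these are the two cells I mean. I'm assuming the training job writes its logs to a ./training directory; adjust --logdir to wherever yours go:

```
%load_ext tensorboard
%tensorboard --logdir ./training
```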
The generated TFRecord files are of version 1.6.
Another error I encountered was :
Signature mismatch. Keys must be dtype <dtype: 'float32'>, got <dtype: 'string'>
This is resolved by adding the dct_method parameter as follows:
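The exact snippet didn't survive my notes, but the general shape of the fix is to give the JPEG decoder an explicit dct_method. 'INTEGER_ACCURATE' is a valid value for tf.image.decode_jpeg in TF 1.x; treat the call site and surrounding names below as a sketch, since the tutorial's script may decode the image elsewhere:

```python
import tensorflow as tf

def decode(encoded_jpg):
    # Hypothetical wrapper: force an integer-accurate DCT during decoding
    # so the decoded tensor's dtype matches what the pipeline expects.
    return tf.image.decode_jpeg(encoded_jpg, channels=3,
                                dct_method='INTEGER_ACCURATE')
```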
Modify the xml_to_csv method (maybe needed)
In the xml_to_csv method there is this line:
value = (root.find('filename').text + '.jpg',
If you use labelImg to label your own image dataset, the filename may already be stored with the extension (.jpg). Check your XML files, and if the extension is already there, remove the + '.jpg' part from the code.
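A more defensive alternative (a sketch, not the tutorial's exact code) is to append the extension only when it is missing, so the script works with either style of XML:

```python
import os
import xml.etree.ElementTree as ET

def filename_with_ext(root, ext='.jpg'):
    """Return the <filename> text, appending ext only if it is missing."""
    name = root.find('filename').text
    return name if os.path.splitext(name)[1] else name + ext

# Works whether labelImg stored 'img001' or 'img001.jpg':
xml = '<annotation><filename>img001</filename></annotation>'
print(filename_with_ext(ET.fromstring(xml)))  # img001.jpg
```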
Afterword
Machine learning is very time consuming, especially the data-preparation step. I didn't get good enough detection with my own dataset, so I switched to a published open dataset, which has thousands more samples than I had gathered. If I had to do the data gathering and labeling myself, there is no way I could finish any project in a reasonable time.
Furthermore, the training process itself is very time consuming, even on high-powered PCs. If you take a wrong step or miss a parameter, redoing the process can eat up just as much time again.
I hope these two problems can be addressed in the near future; they are what is keeping machine learning from becoming more common among the masses.