Posts

Pokemon Classification by Siamese Network and One Shot Learning

Some of my recent clients are interested in image classification using limited training data. A major use case is detecting defective products on a manufacturing line. Defective products are normally caught by human operators watching the production line; this requires constant concentration and effort, so automating it would bring a significant productivity boost to manufacturers. One way to tackle the limited-data problem is to build a network that compares input images against the training dataset: images of defective products will look different from those of passing products.

Experiments

For this experiment, I used a Siamese Network to generate embeddings for images of Pokemon. The dataset has 4 classes: Pikachu, Squirtle, Bulbasaur, and Charmander. The images were scraped from the internet using the Bing API. The network is trained on these images to find a set of weights that clearly separates the 4 classes. If a new image which is not included in the 4 classes is used...
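The excerpt above doesn't include code, so here is a rough sketch of the one-shot idea only, not the post's actual model: assuming a trained embedding function (stubbed here with a fixed random projection in place of a real Siamese CNN), a query image is classified by comparing embedding distances to one reference image per class, with a distance threshold to reject images outside the 4 classes. All names and the threshold value are my own illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a trained Siamese branch: a fixed linear projection.
# A real model would be a CNN trained with a contrastive or triplet loss.
W = rng.normal(size=(64, 16))

def embed(image_vec):
    """Map a flattened image vector to a 16-d embedding."""
    return image_vec @ W

CLASSES = ["Pikachu", "Squirtle", "Bulbasaur", "Charmander"]

# One reference image ("one shot") per class; random placeholders here.
references = {name: rng.normal(size=64) for name in CLASSES}
prototypes = {name: embed(vec) for name, vec in references.items()}

def classify(image_vec, threshold=10.0):
    """Return the nearest class by embedding distance, or 'unknown'."""
    emb = embed(image_vec)
    dists = {name: np.linalg.norm(emb - p) for name, p in prototypes.items()}
    best = min(dists, key=dists.get)
    return best if dists[best] < threshold else "unknown"

# A near-copy of the Pikachu reference should map back to Pikachu.
query = references["Pikachu"] + rng.normal(scale=0.01, size=64)
print(classify(query))
```

The appeal for the defective-product use case is that adding a new class (or a new defect type) only requires new reference images, not retraining from scratch.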
Recent posts

Camera and Projector Calibration for Projection Mapping

In this post I'd like to share my experiments in calibrating a projector using a camera with known intrinsic and extrinsic parameters. No math, just concepts. The final result is a projected image that follows a marker's position and orientation. The calibration process lets us automatically establish the relationship between the projector and the camera that detects the marker. In short, these are the steps performed:

1. Calibrate the relative position and orientation of the camera and projector
2. Detect the marker using the camera
3. Adjust the projected image's position and orientation based on the marker's pose

Experiment setup

The following videos show the results of the experiments. As you can see, the projected dots follow the position and orientation of the chessboard marker. This setup will be very useful for projection mapping installations. Of course, the current setup is only useful at small sizes, but by using a larger marker we can theoretic...
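The post stays math-free, so purely as my own illustration of what the calibrated camera-projector relationship enables (not the post's code): once the projector's intrinsics and its pose relative to the camera are known, a 3D marker point detected in camera coordinates can be mapped into projector pixels with a pinhole projection. All the numbers below are assumed values.

```python
import numpy as np

# Hypothetical projector intrinsics: focal lengths and principal point, in pixels.
K_proj = np.array([[1200.0,    0.0, 640.0],
                   [   0.0, 1200.0, 360.0],
                   [   0.0,    0.0,   1.0]])

# Hypothetical camera-to-projector extrinsics from calibration:
# identity rotation and a 10 cm horizontal baseline (meters).
R = np.eye(3)
t = np.array([0.10, 0.0, 0.0])

def project_to_projector(point_cam):
    """Map a 3D point in camera coordinates to projector pixel coordinates."""
    point_proj = R @ point_cam + t   # camera frame -> projector frame
    uvw = K_proj @ point_proj        # pinhole projection
    return uvw[:2] / uvw[2]          # divide by depth

# A marker corner detected 1 m in front of the camera, on its optical axis.
u, v = project_to_projector(np.array([0.0, 0.0, 1.0]))
print(u, v)  # -> 760.0 360.0
```

Drawing the projected dots at these pixel coordinates is what makes them land on the physical marker as it moves.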

Pose recognition with Intel's RealSense and OpenVINO

Just a short post today on Intel's OpenVINO toolkit. It is a great collection of machine learning models that can really speed up development. I played around with their pose estimation model and the results are quite good.

Final result of the pose detection system

This system, combined with Intel's RealSense depth cameras, offers a promising real-world solution to the pose estimation problem. As shown in the video, it works reasonably well on my laptop with an NVIDIA 1060 GPU. The white spheres indicate the feet, and thanks to the RealSense depth camera we can estimate the positions of the limbs in 3D space. Although I haven't tried it yet, the models can be converted to run on edge platforms such as the Movidius device. Obviously this opens up a whole array of applications, which I hope to share in a future post.
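Getting limb positions in 3D works by combining the 2D keypoints from the pose model with the depth image. As a minimal sketch of that deprojection step (my own illustration with assumed intrinsics, not the post's code; librealsense provides rs2_deproject_pixel_to_point for the real thing):

```python
import numpy as np

# Assumed depth-camera intrinsics: focal lengths (pixels) and principal point.
fx, fy, cx, cy = 600.0, 600.0, 320.0, 240.0

def deproject(u, v, depth_m):
    """Back-project pixel (u, v) with a depth reading (meters) to a 3D point
    in camera space, using the pinhole camera model."""
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return np.array([x, y, depth_m])

# A keypoint detected at the image center, 1.5 m away,
# lands on the camera's optical axis.
point = deproject(320.0, 240.0, 1.5)
print(point)
```

Applying this to each detected keypoint (e.g. the feet) gives the 3D sphere positions shown in the video.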

3D Reconstruction with markers for multi view object alignment

In one of my projects, I needed to reconstruct an object for automatic painting by a robot arm. The target object is a hollow electricity meter box, and the robot needs to paint the insides of this object. It is easy to paint the exterior of an object, but to paint the insides of a hollow object we need a decent enough point cloud to make sure our path planning works properly. An illustration of the object to paint is given below.

As indicated by the red arrows, the area just behind the small area is the target of the painter. Since I don't have electricity meter boxes lying around, I ran my experiments on a cereal carton. This was a long, winding road of trial and error, so the write-up is divided into several parts. The first part covers point cloud generation, followed by experiments on camera movement tracking and on scene and object reconstruction from point clouds. Finally, a method to determine path points for the robot painter will be discussed.

Preliminary research

One of the first things t...

Counting how many people walk by using Python, TensorFlow and height estimation

In this article, I'd like to share my project to count the number of people walking past a store, shop, etc. The motivation: when I was part of a family-oriented digital theme park company, we wanted to know whether daily sales (visitor numbers) were good enough. First, some background. Our theme park (store) was part of a department store/shopping complex, so we did not have a large piece of land like Disneyland, where all visitors are definitely there to visit the park. In our case, visitors to the shopping complex might consider taking their children to our entertainment theme park, so we would get a certain percentage of the complex's total visitors. So, for example, if we had 100 visitors on a single day, how would we know if that number is good or bad? If we knew the number of people passing our store, we would have better context. If the total number of people is, say, around 500, then we would be capturing 20%, which doesn't seem so bad. But if the shopping compl...

Using FCM with the new HTTP v1 API and NodeJS

When trying to send FCM notifications I found out that Google has changed their API specifications. The legacy API still works, but if you want to use the latest v1 API you need to make several changes. The list of changes is on their site, so I won't repeat it here; I'll just mention some of the things that caused trial and error in my project. The official guide from Google is here: Official Migration Guide to v1. The request must have a body with JSON containing the message data. Most importantly, it needs a "message" field, which must contain the target of the notification, usually a topic or device IDs. Since my previous project was using GAS, my request had a field called "payload" instead of "body". Adapting the request from my previous project, my request in Node JS was as follows: request({ url: 'https://fcm.googleapis.com/v1/projects/safe-door-278108/messages:send', method: ...
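The excerpt's snippet is cut off above, so as a hedged sketch of the body shape the v1 endpoint expects (shown in Python for consistency with the other examples here, though the post uses Node JS; the field names are from the v1 API, the project ID is from the post's URL, and the topic name "news" is a made-up example):

```python
import json

# v1 puts the project ID in the URL, per the post's snippet.
url = "https://fcm.googleapis.com/v1/projects/safe-door-278108/messages:send"

# Everything nests under a top-level "message" object. The target here is a
# topic; a "token" field would target a specific device instead.
body = {
    "message": {
        "topic": "news",  # hypothetical topic name
        "notification": {
            "title": "Hello",
            "body": "Sent via the HTTP v1 API",
        },
    }
}

payload = json.dumps(body)
print(payload)
```

Note also that, unlike the legacy API's server key, the v1 API authenticates with an OAuth 2.0 Bearer token in the Authorization header.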

Scan doodles and watch them come alive!

In this post I'd like to share one of my projects involving doodles and bringing them to life with Unity. We prepare doodle papers and some crayons and let children color them. After they're done, we scan the images and they appear on the screen, which is projected on walls using projectors.

Doodles come alive on the screen

Project flow

I used readily available document scanners such as the following.

A document scanner

The scanner has many helpful features, such as cropping and rotating the scanned images so they come out nice and straight even if the paper is slightly rotated or misaligned. The scanned images are stored on a server, and a Unity application polls the server for new images every few seconds. For the server I initially used AWS S3 for image storage, and later we switched to a local image server running Node JS. Attaching a 2D Texture to a 3D Model I no longer have access to the actual doodle papers but they look l...
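The polling step above can be sketched language-agnostically; the actual client is a Unity application, so the Python below is only an illustration of the logic, and list_images is a hypothetical stand-in for the server call:

```python
import time

def poll_new_images(list_images, seen, interval_s=5.0, cycles=3):
    """Repeatedly ask the server for filenames and collect only unseen ones."""
    new_images = []
    for _ in range(cycles):
        for name in list_images():
            if name not in seen:
                seen.add(name)           # remember it so we don't re-add it
                new_images.append(name)
        time.sleep(interval_s)
    return new_images

# Simulated server responses: the second doodle appears on a later poll.
responses = iter([["doodle_001.png"],
                  ["doodle_001.png", "doodle_002.png"],
                  []])
seen = set()
found = poll_new_images(lambda: next(responses), seen, interval_s=0.0, cycles=3)
print(found)  # -> ['doodle_001.png', 'doodle_002.png']
```

Each newly found image would then be handed to Unity to texture onto its 3D model.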