3D Reconstruction with markers for multi-view object alignment

In one of my projects, I needed to reconstruct an object for automatic painting by a robot arm. The target object is a hollow electricity meter box, and the robot needs to paint its inside surfaces. Painting the exterior of an object is easy, but to paint the inside of a hollow object we need a decent point cloud to make sure our path planning works properly.

An illustration of the object to be painted is given below.


As indicated by the red arrows, the area just behind the small opening is the target of the painter.

Since I don't have electricity boxes lying around, I tried my experiments on a cereal carton.

This has been a long and winding road of trial and error, so the write-up is divided into several parts. The first part is about point cloud generation, followed by experiments on camera movement tracking and on scene and object reconstruction from point clouds. Finally, a method for determining path points for the robot painter will be discussed.

Preliminary research

One of the first things to try was a 3D reconstruction of the scene, in the hope of getting a representation of the object good enough for the path planning algorithm.

3D Scene Reconstruction

Using a Realsense D415 depth camera, I attempted to capture the cereal box as a point cloud.

Point cloud representation of the cereal box
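
For reference, a single view like the one above can be captured with pyrealsense2 and Open3D along the following lines. This is a minimal sketch rather than the exact code used in this project; the stream resolution, warm-up frame count, and output filename are placeholder choices.

    import numpy as np
    import pyrealsense2 as rs
    import open3d as o3d

    # Start depth and color streams on the D415.
    pipeline = rs.pipeline()
    config = rs.config()
    config.enable_stream(rs.stream.depth, 640, 480, rs.format.z16, 30)
    config.enable_stream(rs.stream.color, 640, 480, rs.format.rgb8, 30)
    pipeline.start(config)

    # Skip a few frames so auto-exposure settles, then align depth to color.
    for _ in range(15):
        pipeline.wait_for_frames()
    align = rs.align(rs.stream.color)
    frames = align.process(pipeline.wait_for_frames())
    depth = frames.get_depth_frame()
    color = frames.get_color_frame()

    # Back-project the RGBD pair using the color stream intrinsics.
    intr = color.profile.as_video_stream_profile().intrinsics
    pinhole = o3d.camera.PinholeCameraIntrinsic(
        intr.width, intr.height, intr.fx, intr.fy, intr.ppx, intr.ppy)
    rgbd = o3d.geometry.RGBDImage.create_from_color_and_depth(
        o3d.geometry.Image(np.asanyarray(color.get_data())),
        o3d.geometry.Image(np.asanyarray(depth.get_data())),
        depth_scale=1000.0,  # D415 depth units are 1 mm by default
        convert_rgb_to_intensity=False)
    cloud = o3d.geometry.PointCloud.create_from_rgbd_image(rgbd, pinhole)

    pipeline.stop()
    o3d.io.write_point_cloud("cereal_box.ply", cloud)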

This approach has a few problems, mainly:
  • Edges have a lot of noise
  • Sides that are not visible from the camera are not detected
To solve these problems, I looked into ways to merge point clouds from multiple views so we can get a full representation of the box. I first tried ICP with iterative point registration, but the accuracy was not good enough, so I tried using a marker to capture the various views and merge the point clouds together.
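
For completeness, the ICP attempt looked roughly like the sketch below, assuming two saved views and illustrative parameter values (1 cm voxels, a 1.5 cm correspondence threshold). Note that ICP starts from an identity initial guess here; with views as far apart as mine, it tends to converge to a poor local minimum, which is consistent with the accuracy problems mentioned above.

    import open3d as o3d
    from open3d.pipelines import registration as reg

    source = o3d.io.read_point_cloud("view_0.ply")
    target = o3d.io.read_point_cloud("view_1.ply")

    # Downsample and estimate normals so point-to-plane ICP has surfaces to use.
    voxel = 0.01  # meters
    source_d = source.voxel_down_sample(voxel)
    target_d = target.voxel_down_sample(voxel)
    for pcd in (source_d, target_d):
        pcd.estimate_normals(
            o3d.geometry.KDTreeSearchParamHybrid(radius=voxel * 2, max_nn=30))

    result = reg.registration_icp(
        source_d, target_d, voxel * 1.5,
        estimation_method=reg.TransformationEstimationPointToPlane())
    print("fitness:", result.fitness, "inlier RMSE:", result.inlier_rmse)
    source.transform(result.transformation)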

Marker-based multi-view point cloud reconstruction

What follows is my attempt to get a better point cloud reconstruction from multiple views using a marker of known pattern and size. The marker chosen was the chessboard pattern widely used in camera calibration. The following images depict the scene setup.

Three views of a sofa in the living room along with the chessboard pattern
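
Each view's camera pose relative to the chessboard can be estimated with OpenCV. The sketch below is my best-guess reconstruction of that step, not the project's verbatim code: the 9x6 inner-corner pattern, the 25 mm square size, and the intrinsics filenames are assumptions for illustration (with the D415, the factory color intrinsics can also be used instead of loading calibrated ones).

    import cv2
    import numpy as np

    PATTERN = (9, 6)  # inner corners per row and column (assumed)
    SQUARE = 0.025    # square size in meters (assumed)

    # 3D corner coordinates in the marker's own frame (the board is the z=0 plane).
    obj_pts = np.zeros((PATTERN[0] * PATTERN[1], 3), np.float32)
    obj_pts[:, :2] = np.mgrid[0:PATTERN[0], 0:PATTERN[1]].T.reshape(-1, 2) * SQUARE

    img = cv2.imread("view_0_color.png")
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    found, corners = cv2.findChessboardCorners(gray, PATTERN)
    assert found, "chessboard not visible in this view"
    corners = cv2.cornerSubPix(
        gray, corners, (11, 11), (-1, -1),
        (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 0.001))

    # K and dist come from a prior intrinsic calibration (hypothetical filenames).
    K = np.load("intrinsics.npy")
    dist = np.load("distortion.npy")
    ok, rvec, tvec = cv2.solvePnP(obj_pts, corners, K, dist)

    # Pack the pose into a 4x4 matrix taking marker-frame points to the camera frame.
    R, _ = cv2.Rodrigues(rvec)
    T_cam_marker = np.eye(4)
    T_cam_marker[:3, :3] = R
    T_cam_marker[:3, 3] = tvec.ravel()
    np.save("pose_0.npy", T_cam_marker)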

Using a marker improved the reconstruction considerably, and I managed to reconstruct the sofa as shown below.


3D point cloud representation of the sofa

Color image of the sofa and its surroundings from the same view

Implementing marker-based multi-view registration on the cereal box

The next step was to try the reconstruction process on the cereal box. The marker also has to be visible to the camera, which poses a small challenge in camera placement, but overall I found this is not hard to overcome.


Two views of the cereal box next to a known marker

With this method, there is no need to incrementally register point clouds; we can simply rely on the transformation matrix of the camera relative to the marker.
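
Concretely, once each view's camera-to-marker pose is known, every point cloud can be mapped into the shared marker frame and concatenated. A minimal sketch, reusing the hypothetical filenames from the pose step above:

    import numpy as np
    import open3d as o3d

    views = [("view_0.ply", "pose_0.npy"), ("view_1.ply", "pose_1.npy")]

    merged = o3d.geometry.PointCloud()
    for cloud_path, pose_path in views:
        cloud = o3d.io.read_point_cloud(cloud_path)
        T_cam_marker = np.load(pose_path)  # 4x4 pose from the solvePnP step
        # The inverse pose maps points from this camera's frame into the
        # marker frame, which is common to all views.
        cloud.transform(np.linalg.inv(T_cam_marker))
        merged += cloud

    merged = merged.voxel_down_sample(0.005)  # thin out overlapping regions
    o3d.io.write_point_cloud("merged_box.ply", merged)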

The resulting reconstruction is shown below.


We can see that the transformation matrix from the marker really helped in aligning the two views, even though they are far apart and we have no incremental positions in between. The accuracy still needs work, however; I believe this can be improved with better hardware and a more thorough calibration of the camera parameters. Using multiple markers may also improve the estimation of the transformation matrix.

Future work

A better calibration process, along with multiple markers, seems promising for increasing the accuracy of the alignment.
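
For reference, the standard chessboard intrinsic calibration in OpenCV looks roughly like the sketch below; the pattern size, square size, and image filenames are again illustrative. The reported reprojection RMS gives a quick quality check before reusing the intrinsics for pose estimation.

    import glob
    import cv2
    import numpy as np

    PATTERN = (9, 6)
    SQUARE = 0.025
    board = np.zeros((PATTERN[0] * PATTERN[1], 3), np.float32)
    board[:, :2] = np.mgrid[0:PATTERN[0], 0:PATTERN[1]].T.reshape(-1, 2) * SQUARE

    obj_pts, img_pts, size = [], [], None
    for path in glob.glob("calib_*.png"):  # hypothetical capture filenames
        gray = cv2.cvtColor(cv2.imread(path), cv2.COLOR_BGR2GRAY)
        size = gray.shape[::-1]
        found, corners = cv2.findChessboardCorners(gray, PATTERN)
        if found:
            obj_pts.append(board)
            img_pts.append(corners)

    rms, K, dist, _, _ = cv2.calibrateCamera(obj_pts, img_pts, size, None, None)
    print("reprojection RMS (px):", rms)  # aim for well under 1 pixel
    np.save("intrinsics.npy", K)
    np.save("distortion.npy", dist)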
