Author: Pin-Wei "David" Chen
E-mail: ccpwearth@gmail.com
Code: https://github.com/championway/gan_rv_thesis
Thesis: PDF
Advances in deep learning and artificial intelligence have pushed computer vision technology and its applications to new levels. Computers can now not only perform image processing, classification, and object detection, but also "create" images much as humans do, thanks to developments in generative models. In particular, the generative adversarial network (GAN) has given rise to many architectures and applications, such as image style transfer, human face generation, and text-to-image generation. However, few studies have applied GANs to real-robot missions to replace or improve on other approaches. This work therefore proposes two GANs, FCN-Pix2Pix and SSIM-CycleGAN, based on Pix2Pix and CycleGAN respectively, and applies them to two real-robot missions that remain challenging for modern solutions: semantic segmentation and transferring a virtual dataset from simulation to the real world (sim to real). The proposed approaches were also compared with current state-of-the-art approaches, and the comparison showed significant advantages for the proposed methods.
Dataset | Size | Images | Real totes | Unity totes | Archive (.zip) |
Tote Sim2Real Sample | 25.9 MB | 100 images | 100 images | 99 images | sim2real_sample.zip |
Tote Sim2Real | 19.8 GB | 68608 images | 5690 images | 9472 images | sim2real.zip |