6D-VNet: End-to-end 6DoF Vehicle Pose Estimation from Monocular RGB Images

Authors Di Wu, Canqun Xiang, Wenbin Zou, Xia Li and Zhaoyong Zhuang
Journal/Conference Name The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2019
Paper Category
Paper Abstract We present a conceptually simple framework for 6DoF object pose estimation, especially for autonomous driving scenarios. Our approach efficiently detects traffic participants in a monocular RGB image while simultaneously regressing their 3D translation and rotation vectors. The method, called 6D-VNet, extends Mask R-CNN by adding customised heads for predicting the vehicle's finer class, rotation and translation. Unlike previous methods, the proposed 6D-VNet is trained end-to-end. Furthermore, we show that the inclusion of translational regression in the joint losses is crucial for the 6DoF pose estimation task, where object translation distance along the longitudinal axis varies significantly, e.g., in autonomous driving scenarios. Additionally, we incorporate the mutual information between traffic participants via a modified non-local block. As opposed to the original non-local block implementation, the proposed weighting modification takes the spatial neighbouring information into consideration whilst counteracting the effect of extreme gradient values. Our 6D-VNet reaches 1st place in the ApolloScape challenge 3D Car Instance task [21]. Code has been made available at: https://github.com/stevenwudi/6DVNET.
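As a rough illustration of what "translational regression in the joint losses" can look like, the sketch below combines a rotation term and a translation term into one scalar. It is not the authors' implementation: the function names, the quaternion representation, the Huber penalty, and the weights `w_rot` and `w_trans` are all assumptions made for this example.

```python
import math


def huber(residual, delta=1.0):
    """Huber penalty: quadratic near zero, linear for large residuals."""
    a = abs(residual)
    return 0.5 * a * a if a <= delta else delta * (a - 0.5 * delta)


def rotation_loss(pred_q, gt_q):
    """1 - |<q_pred, q_gt>| on normalised quaternions (assumed metric).

    The absolute value makes q and -q, which encode the same rotation,
    give a loss of zero.
    """
    norm_p = math.sqrt(sum(c * c for c in pred_q))
    norm_g = math.sqrt(sum(c * c for c in gt_q))
    dot = sum(p * g for p, g in zip(pred_q, gt_q)) / (norm_p * norm_g)
    return 1.0 - abs(dot)


def translation_loss(pred_t, gt_t, delta=1.0):
    """Mean Huber penalty over the (x, y, z) translation components."""
    return sum(huber(p - g, delta) for p, g in zip(pred_t, gt_t)) / len(gt_t)


def joint_pose_loss(pred_q, gt_q, pred_t, gt_t,
                    w_rot=1.0, w_trans=1.0, delta=1.0):
    """Weighted sum of the rotation and translation terms (hypothetical weights)."""
    return (w_rot * rotation_loss(pred_q, gt_q)
            + w_trans * translation_loss(pred_t, gt_t, delta))


if __name__ == "__main__":
    # Correct rotation, but the depth estimate is off by 2 m.
    loss = joint_pose_loss((1, 0, 0, 0), (1, 0, 0, 0), (0, 0, 12), (0, 0, 10))
    print(loss)
```

Setting `w_trans=0` in this sketch drops the translation term, so an error along the longitudinal axis would no longer contribute to the loss. That is one simple way to see why the abstract stresses keeping the translation term when distances vary widely.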
Date of publication 2019
Code Programming Language Python
Comment