EPro-PnP v2

This repository contains the upgraded code for the CVPR 2022 paper EPro-PnP, featuring improved models for both the 6DoF pose estimation and monocular 3D object detection benchmarks.

An updated preprint can be found on arXiv: EPro-PnP: Generalized End-to-End Probabilistic Perspective-n-Points for Monocular Object Pose Estimation (journal version published in TPAMI 2024).

Models

EPro-PnP-Det v2: state-of-the-art monocular 3D object detector

Main differences from v1b:

  • Use GaussianMixtureNLLLoss as the auxiliary coordinate regression loss (see the sketch after this list)
  • Add auxiliary depth and bbox losses
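
The actual GaussianMixtureNLLLoss lives in this repository; the sketch below is only a minimal, hypothetical PyTorch version of a Gaussian-mixture negative log-likelihood for coordinate regression, assuming diagonal covariances and mixture weights derived from the network's dense predictions (the tensor shapes and the `gaussian_mixture_nll_loss` name are illustrative, not the repo's API):

```python
import math
import torch

def gaussian_mixture_nll_loss(mean, var, logits, target, eps=1e-6):
    """Negative log-likelihood of `target` under a diagonal Gaussian mixture.

    mean:   (N, K, D) component means, e.g. dense 3D coordinate predictions
    var:    (N, K, D) per-component variances
    logits: (N, K)    unnormalized mixture weights
    target: (N, D)    regression target, e.g. the ground-truth coordinate
    """
    var = var.clamp(min=eps)
    diff = target.unsqueeze(1) - mean                                # (N, K, D)
    # Log-density of the target under each diagonal Gaussian component
    log_prob = -0.5 * (diff.pow(2) / var + var.log()
                       + math.log(2 * math.pi)).sum(dim=-1)          # (N, K)
    # Mixture log-likelihood via log-sum-exp over components, then mean NLL
    log_mix = torch.logsumexp(torch.log_softmax(logits, dim=-1) + log_prob, dim=-1)
    return -log_mix.mean()
```

Compared with a plain L1/smooth-L1 coordinate loss, a mixture NLL lets the network spread probability mass over several plausible coordinates instead of being forced to commit to a single point estimate.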

At the time of submission (Aug 30, 2022), EPro-PnP-Det v2 ranks 1st among all camera-based single-frame object detection models on the official nuScenes benchmark (test split, without extra data).

| Method                 | TTA | Backbone | NDS   | mAP   | mATE  | mASE  | mAOE  | mAVE  | mAAE  | Schedule |
|------------------------|-----|----------|-------|-------|-------|-------|-------|-------|-------|----------|
| EPro-PnP-Det v2 (ours) | Y   | R101     | 0.490 | 0.423 | 0.547 | 0.236 | 0.302 | 1.071 | 0.123 | 12 ep    |
| PETR                   | N   | Swin-B   | 0.483 | 0.445 | 0.627 | 0.249 | 0.449 | 0.927 | 0.141 | 24 ep    |
| BEVDet-Base            | Y   | Swin-B   | 0.482 | 0.422 | 0.529 | 0.236 | 0.395 | 0.979 | 0.152 | 20 ep    |
| EPro-PnP-Det v2 (ours) | N   | R101     | 0.481 | 0.409 | 0.559 | 0.239 | 0.325 | 1.090 | 0.115 | 12 ep    |
| PolarFormer            | N   | R101     | 0.470 | 0.415 | 0.657 | 0.263 | 0.405 | 0.911 | 0.139 | 24 ep    |
| BEVFormer-S            | N   | R101     | 0.462 | 0.409 | 0.650 | 0.261 | 0.439 | 0.925 | 0.147 | 24 ep    |
| PETR                   | N   | R101     | 0.455 | 0.391 | 0.647 | 0.251 | 0.433 | 0.933 | 0.143 | 24 ep    |
| EPro-PnP-Det v1        | Y   | R101     | 0.453 | 0.373 | 0.605 | 0.243 | 0.359 | 1.067 | 0.124 | 12 ep    |
| PGD                    | Y   | R101     | 0.448 | 0.386 | 0.626 | 0.245 | 0.451 | 1.509 | 0.127 | 24+24 ep |
| FCOS3D                 | Y   | R101     | 0.428 | 0.358 | 0.690 | 0.249 | 0.452 | 1.434 | 0.124 | -        |

TTA = test-time augmentation. NDS and mAP are higher-better; mATE, mASE, mAOE, mAVE, and mAAE are error metrics (lower is better).

EPro-PnP-6DoF v2 for 6DoF pose estimation

Main differences from v1b:

  • Improve w2d scale handling (very important; see the sketch after this list)
  • Improve network initialization
  • Adjust loss weights
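
The repository holds the actual implementation; the snippet below is only a sketch of the general idea behind decoupled w2d scale handling, assuming the spatial distribution of the 2D weights is normalized by a softmax while their overall magnitude comes from a separately predicted positive global scale (the function and tensor names are hypothetical):

```python
import torch
import torch.nn.functional as F

def normalize_w2d(w2d_raw, log_scale):
    """Split w2d into a spatial distribution and a global confidence scale.

    w2d_raw:   (N, 2, H, W) raw weight logits from the dense head
    log_scale: (N, 2)       predicted log of the global weight scale
    """
    n, c, h, w = w2d_raw.shape
    # Spatial softmax: per channel, the weights over all points sum to 1,
    # so this branch only learns the relative weighting of the points ...
    w2d = F.softmax(w2d_raw.flatten(2), dim=-1).reshape(n, c, h, w)
    # ... while the overall pose confidence is carried by a separate
    # positive global scale factor
    return w2d * log_scale.exp().view(n, c, 1, 1)
```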

With these updates, the v2 model can be trained without 3D object models yet achieves better performance (ADD 0.1d = 93.83) than GDRNet (ADD 0.1d = 93.6), unleashing the full potential of simple end-to-end training.
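
For reference, ADD 0.1d is the standard LineMOD-style accuracy: a predicted pose counts as correct when the average distance between the object's model points transformed by the ground-truth pose and by the predicted pose is below 10% of the object diameter. A minimal NumPy sketch (names are illustrative):

```python
import numpy as np

def add_distance(pts, R_gt, t_gt, R_pred, t_pred):
    """Average distance (ADD) between model points under two poses."""
    gt = pts @ R_gt.T + t_gt          # (M, 3) points under the GT pose
    pred = pts @ R_pred.T + t_pred    # (M, 3) points under the predicted pose
    return np.linalg.norm(gt - pred, axis=1).mean()

# A pose is correct under ADD 0.1d when this distance is below 10% of the
# object diameter; the reported score is the fraction of correct poses:
# correct = add_distance(pts, R_gt, t_gt, R_pred, t_pred) < 0.1 * diameter
```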

Citation

If you find this project useful in your research, please consider citing:

@inproceedings{epropnp,
  author = {Hansheng Chen and Pichao Wang and Fan Wang and Wei Tian and Lu Xiong and Hao Li},
  title = {EPro-PnP: Generalized End-to-End Probabilistic Perspective-n-Points for Monocular Object Pose Estimation},
  booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year = {2022}
}
