Code for the paper "Densely Constrained Depth Estimator for Monocular 3D Object Detection"

Abstract: Estimating accurate 3D locations of objects from monocular images is challenging because depth information is missing. Previous work shows that using an object's keypoint projection constraints to estimate multiple depth candidates boosts detection performance. However, existing methods can only use vertical edges as projection constraints for depth estimation, so they exploit only a small number of constraints and produce too few depth candidates, leading to inaccurate depth estimation. In this paper, we propose a method that utilizes dense projection constraints from edges in any direction. In this way, we employ far more projection constraints and produce considerably more depth candidates. In addition, we present a graph matching weighting module to merge the depth candidates. The proposed method, DCD (Densely Constrained Detector), achieves state-of-the-art performance on the KITTI and WOD benchmarks. Code is released at this https URL.

DCD

Released code for "Densely Constrained Depth Estimator for Monocular 3D Object Detection" (ECCV 2022, arXiv).

Yingyan Li, Yuntao Chen, Jiawei He, Zhaoxiang Zhang

The code still needs to be cleaned; please wait a few days.

Environment

This repo is tested with Ubuntu 16.04, Python 3.8, PyTorch 1.7.0, and CUDA 10.1.

conda create -n dcd python=3.8
conda activate dcd
conda install pytorch==1.7.0 torchvision==0.8.0 cudatoolkit=10.1 -c pytorch
pip install -r requirements.txt
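
As a quick sanity check (not part of the original instructions), you can verify that the installed PyTorch matches the tested version and sees your GPU:

# Should print "1.7.0 True" on a machine with a working CUDA 10.1 setup
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"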

You also need to build DCNv2 and this project:

cd DGDE/models/backbone/DCNv2
python setup.py develop
cd ../../..
python setup.py develop
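
To confirm the DCNv2 extension compiled correctly, the following import should succeed. This assumes the bundled DCNv2 exposes the usual dcn_v2 module, as in the upstream DCNv2 repo; adjust the import if this copy is laid out differently:

# A successful import means the CUDA extension built and loads
python -c "from dcn_v2 import DCN; print('DCNv2 OK')"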

Directory Structure

We need the KITTI dataset and the keypoint annotations (Google Drive).

After downloading them, please organize the files as:

|DGDE
  |dataset
    |kitti
      |training/
        |calib/
        |image_2/
        |label_2/
        |ImageSets/
      |testing/
        |calib/
        |image_2/
        |ImageSets/
  |kpts_ann
    |kpts_ann_train.json
    |kpts_ann_val.json
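
One convenient way to get this layout is to symlink an existing KITTI download into place and copy the annotation files over. This is only a sketch; the /path/to/... sources are placeholders for wherever your data actually lives:

cd DGDE
mkdir -p dataset kpts_ann
# Symlink your KITTI root (containing training/ and testing/) instead of copying it
ln -s /path/to/kitti dataset/kitti
cp /path/to/kpts_ann_train.json /path/to/kpts_ann_val.json kpts_ann/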

Training and evaluation pipeline

The whole pipeline includes three parts: a) train DGDE; b) use DGDE to generate the data needed by GMW; c) train GMW and evaluate.

a) Training DGDE. Train with 2 GPUs:

cd DGDE
CUDA_VISIBLE_DEVICES=0,1 \
python tools/plain_train_net.py --batch_size 8 --config runs/DGDE.yaml \
--output output/DGDE --num_gpus 2

b) Using DGDE to generate the data needed by GMW. After DGDE training finishes, generate the data on 1 GPU:

cd DGDE
CUDA_VISIBLE_DEVICES=0 \
python tools/plain_train_net.py --batch_size 8 --config runs/DGDE.yaml \
--output output/DGDE --num_gpus 1 \
--generate_for_GMW \
--ckpt output/DGDE/model_final.pth

After this step, you should see gen_data_train.json and gen_data_infer.json in DGDE/gen_data/.
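
If you want to confirm the generation step succeeded, a minimal check is to load each file and print how many records it holds (run from the DGDE directory; this makes no assumption about the internal record format):

python -c "import json; d = json.load(open('gen_data/gen_data_train.json')); print(len(d))"
python -c "import json; d = json.load(open('gen_data/gen_data_infer.json')); print(len(d))"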

c) Training GMW and evaluating. Train with 4 GPUs:

cd GMW
python -m torch.distributed.launch --master_port 33521 --nproc_per_node=4 \
main.py --log-dir ./logs/GMW \
-b 8 --lr 1e-4 --epoch 100 --val_freq 5 \
--train_data_path ../DGDE/gen_data/gen_data_train.json \
--val_data_path ../DGDE/gen_data/gen_data_infer.json

The model is evaluated periodically during training. You can also run the following command for evaluation:

python -m torch.distributed.launch --master_port 24281 --nproc_per_node=4 \
main.py --log-dir ./logs/GMW/ \
-b 36 -e \
--resume logs/GMW/checkpoint_epoch_100.pth.tar

You can also use the pre-trained weights of DGDE and GMW (Google Drive).
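
For example, to skip training you can point step b) at the downloaded DGDE checkpoint and the evaluation command at the downloaded GMW checkpoint. The checkpoint paths below are placeholders for wherever you saved the downloaded files:

# Generate GMW data with the pre-trained DGDE weights
cd DGDE
CUDA_VISIBLE_DEVICES=0 \
python tools/plain_train_net.py --batch_size 8 --config runs/DGDE.yaml \
--output output/DGDE --num_gpus 1 \
--generate_for_GMW \
--ckpt /path/to/pretrained_DGDE.pth

# Evaluate with the pre-trained GMW weights
cd ../GMW
python -m torch.distributed.launch --master_port 24281 --nproc_per_node=4 \
main.py --log-dir ./logs/GMW/ \
-b 36 -e \
--resume /path/to/pretrained_GMW.pth.tar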

Acknowledgment

The code is mainly based on MonoFlex and BPnP. Thanks for their great work.
