Eldar Insafutdinov and Alexey Dosovitskiy.
Neural Information Processing Systems, 2018.
Website: https://eldar.insafutdinov.com/PointClouds
Paper: https://arxiv.org/abs/1810.09381
The code is in Python 3.6 and Tensorflow >= 1.11. We provide instructions on how to install all dependencies with conda, but using pip should also work. We suggest downloading miniconda for python 3.
Create Python 3.6 environment (some packages are not yet available for the more recent python 3.7):
conda create -n py36 python=3.6
conda activate py36
Install basic dependencies:
conda install numpy scikit-image pillow scipy scikit-learn pyyaml
pip install easydict
Install open3d (more information can be found on the official website):
conda install -c open3d-admin open3d
Install TensorFlow:
pip install tensorflow-gpu
We use Shapenet v1 for all experiments in the paper.
For convenience, we provide the rendered images, used for training, for the chair, car and airplane classes of the ShapeNet dataset (category ID 03001627
, 02958343
and 02691156
respectively):
cd data
./download_train_data.sh 03001627
Convert training images to TFRecords format:
./create_tf_records.sh 03001627
Download pre-computed ground truth point clouds for evaluation:
./download_ground_truth.sh 03001627
You can also generate ground truth yourself as described here.
To train and evaluate the full model without camera pose supervision execute the following:
cd experiments/chair_unsupervised
# train and compute accuracy on the validation set
python ../../dpc/run/train_eval.py
# compute accuracy on the test set
python ../../dpc/run/predict_eval.py --eval_split=test
You can use a --gpu
flag to specify an ID of the GPU you want to run on.
The file chamfer_pred_test.txt
contains the accuracy of 3D shape reconstruction represented by the two quantities: coverage and precision. Chamfer distance metric is the sum of precision and coverage. More details on the evaluation metric can be found in the paper.
The file pose_error_pred_test.txt
contains camera pose estimation error. The first quantity is accuracy at the 30° threshold and the second one is the median error in degrees.
A configuration file to train a model with camera pose supervision is located in experiments/chair_camera_supervision
.
We provide a Jupyter notebook to visualise predicted shapes. The rendering code uses Blender, which you can install in the external/
subdirectory under name blender
or simply create a symlink, for example:
ln -s /path/to/blender-2.79b external/blender
After, you can lunch the notebook experiments/visualise.ipynb
.
@inproceedings{insafutdinov18pointclouds,
title = {Unsupervised Learning of Shape and Pose with Differentiable Point Clouds},
author = {Insafutdinov, Eldar and Dosovitskiy, Alexey},
booktitle = {Advances in Neural Information Processing Systems (NeurIPS)},
year = {2018}
}