NoPoSplat

NoPoSplat predicts 3D Gaussians in a canonical space from unposed sparse images,
enabling high-quality novel view synthesis and accurate pose estimation.

Table of Contents

Installation
Pre-trained Checkpoints
Camera Conventions
Datasets
Running the Code
Acknowledgements
Citation

Installation

Our code requires Python 3.10+ and was developed with PyTorch 2.1.2 and CUDA 11.8; it should also work with newer PyTorch/CUDA versions.

Clone NoPoSplat.

git clone https://github.com/cvg/NoPoSplat
cd NoPoSplat

Create the environment. Here we show an example using conda.

conda create -y -n noposplat python=3.10
conda activate noposplat
pip install torch==2.1.2 torchvision==0.16.2 torchaudio==2.1.2 --index-url https://download.pytorch.org/whl/cu118
pip install -r requirements.txt
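
After installation, a quick sanity check of the interpreter and package versions can catch mismatches early. The script below is a minimal sketch and is not part of the NoPoSplat repository: the helper names `parse_version` and `installed_version` are hypothetical, and the minimum versions are simply the ones stated above (Python 3.10, PyTorch 2.1.2).

```python
import re
import sys
from importlib import metadata
from typing import Optional, Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a string such as '2.1.2+cu118' into (2, 1, 2)."""
    core = version.split("+", 1)[0]  # drop a local suffix like '+cu118'
    numbers = []
    for piece in core.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        numbers.append(int(match.group()))
    return tuple(numbers)


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    python_ok = sys.version_info >= (3, 10)
    print(f"Python {sys.version.split()[0]}: {'ok' if python_ok else 'too old'}")

    torch_version = installed_version("torch")
    if torch_version is None:
        print("torch: not installed")
    else:
        torch_ok = parse_version(torch_version) >= (2, 1, 2)
        print(f"torch {torch_version}: {'ok' if torch_ok else 'older than 2.1.2'}")
```

The check reads package metadata instead of importing `torch`, so it also runs in an environment where the installation is incomplete.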

Optionally, compile the CUDA kernels for RoPE (as in CroCo v2).

# NoPoSplat relies on RoPE positional embeddings, for which you can compile some CUDA kernels for faster runtime.
cd src/model/encoder/backbone/croco/curope/
python setup.py build_ext --inplace
cd ../../../../../..

Pre-trained Checkpoints

Our models are hosted on Hugging Face.

Model name	Training resolutions	Training data
re10k.ckpt	256x256	re10k
acid.ckpt	256x256	acid
mixRe10kDl3dv.ckpt	256x256	re10k, dl3dv
mixRe10kDl3dv_512x512.ckpt	512x512	re10k, dl3dv

We assume the downloaded weights are located in the pretrained_weights directory.
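
Before launching an evaluation, it can help to confirm which checkpoints are actually present. The following is a small sketch, not a script shipped with the repository: the function name `missing_checkpoints` is hypothetical, and the file names are copied from the table above.

```python
from pathlib import Path
from typing import List

# File names taken from the checkpoint table in this README.
KNOWN_CHECKPOINTS = [
    "re10k.ckpt",
    "acid.ckpt",
    "mixRe10kDl3dv.ckpt",
    "mixRe10kDl3dv_512x512.ckpt",
]


def missing_checkpoints(weights_dir: str = "pretrained_weights") -> List[str]:
    """List the known checkpoint files that are not found in weights_dir."""
    root = Path(weights_dir)
    return [name for name in KNOWN_CHECKPOINTS if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_checkpoints()
    if missing:
        print("Missing checkpoints:", ", ".join(missing))
    else:
        print("All known checkpoints are present.")
```

You only need the checkpoints you intend to use, so a non-empty list is not necessarily an error.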

Camera Conventions

Our camera system is the same as pixelSplat's. The camera intrinsic matrices are normalized (the first row is divided by the image width, and the second row is divided by the image height). The camera extrinsic matrices are OpenCV-style camera-to-world matrices (+X right, +Y down, +Z pointing into the screen, i.e. along the camera's viewing direction).
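
To make the convention concrete, here is a minimal pure-Python sketch of both steps. It does not reproduce the repository's data loaders: the helper names `normalize_intrinsics` and `camera_to_world` and the example numbers (a 640x480 image with a focal length of 500 pixels) are assumptions chosen for illustration.

```python
from typing import List

Matrix = List[List[float]]


def normalize_intrinsics(K: Matrix, width: int, height: int) -> Matrix:
    """Divide row 0 by the image width and row 1 by the image height."""
    return [
        [value / width for value in K[0]],
        [value / height for value in K[1]],
        list(K[2]),
    ]


def camera_to_world(c2w: Matrix, p_cam: List[float]) -> List[float]:
    """Map a 3D point from camera coordinates to world coordinates
    using a 4x4 camera-to-world matrix."""
    x, y, z = p_cam
    return [row[0] * x + row[1] * y + row[2] * z + row[3] for row in c2w[:3]]


# Pixel-space intrinsics for a hypothetical 640x480 image.
K_pixels = [
    [500.0, 0.0, 320.0],
    [0.0, 500.0, 240.0],
    [0.0, 0.0, 1.0],
]
K_norm = normalize_intrinsics(K_pixels, width=640, height=480)
print(K_norm[0])  # [0.78125, 0.0, 0.5]

# Camera at world position (1, 2, 3) with no rotation.
c2w = [
    [1.0, 0.0, 0.0, 1.0],
    [0.0, 1.0, 0.0, 2.0],
    [0.0, 0.0, 1.0, 3.0],
    [0.0, 0.0, 0.0, 1.0],
]
# A point 5 units in front of the camera lies along camera +Z.
print(camera_to_world(c2w, [0.0, 0.0, 5.0]))  # [1.0, 2.0, 8.0]
```

After normalization the principal point of a centered camera is (0.5, 0.5) regardless of resolution, which is why the same intrinsics can be reused when images are resized.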

Datasets

Please refer to DATASETS.md for dataset preparation.

Running the Code

Training

The main entry point is src/main.py. Call it via:

# 8 GPUs, with each batch size = 16. Remove the last two arguments if you don't want to use wandb for logging
python -m src.main +experiment=re10k wandb.mode=online wandb.name=re10k

This default training configuration requires 8 GPUs (>=80GB memory each) with a batch size of 16 per GPU. Training takes approximately 6 hours to complete. You can adjust the batch size to fit your hardware, but note that changing the total batch size may require modifying the initial learning rate to maintain performance. You can refer to the re10k_1x8 configuration for training on a single A6000 GPU (48GB memory), which produces similar performance.
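
If you change the total batch size, one common heuristic for choosing a starting learning rate is the linear scaling rule, which scales the rate in proportion to the total batch size. The sketch below illustrates that heuristic only; it is an assumption, not the authors' documented procedure. The function names are hypothetical, and `base_lr=1e-4` is a placeholder, not the value used in the repository's configs.

```python
def total_batch_size(num_gpus: int, per_gpu_batch: int) -> int:
    """Total number of samples per optimizer step across all GPUs."""
    return num_gpus * per_gpu_batch


def linearly_scaled_lr(base_lr: float, base_total: int, new_total: int) -> float:
    """Scale the learning rate in proportion to the total batch size."""
    if base_total <= 0 or new_total <= 0:
        raise ValueError("batch sizes must be positive")
    return base_lr * new_total / base_total


# Default setup from this README: 8 GPUs with a batch size of 16 each.
default_total = total_batch_size(num_gpus=8, per_gpu_batch=16)  # 128

# Hypothetical smaller setup: 1 GPU with a batch size of 8.
small_total = total_batch_size(num_gpus=1, per_gpu_batch=8)  # 8

# Placeholder base learning rate; read the real one from the experiment config.
print(linearly_scaled_lr(base_lr=1e-4, base_total=default_total, new_total=small_total))
```

Treat the result as a starting point for tuning, and prefer the values in a provided configuration where one exists for your hardware.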

Evaluation

Novel View Synthesis

# RealEstate10K
python -m src.main +experiment=re10k mode=test wandb.name=re10k dataset/view_sampler@dataset.re10k.view_sampler=evaluation dataset.re10k.view_sampler.index_path=assets/evaluation_index_re10k.json checkpointing.load=./pretrained_weights/re10k.ckpt test.save_image=true
# ACID
python -m src.main +experiment=acid mode=test wandb.name=acid dataset/view_sampler@dataset.re10k.view_sampler=evaluation dataset.re10k.view_sampler.index_path=assets/evaluation_index_acid.json checkpointing.load=./pretrained_weights/acid.ckpt test.save_image=true

You can set wandb.name=SAVE_FOLDER_NAME to specify the saving path.

Pose Estimation

To evaluate the pose estimation performance, you can run the following command:

# RealEstate10K
python -m src.eval_pose +experiment=re10k +evaluation=eval_pose checkpointing.load=./pretrained_weights/mixRe10kDl3dv.ckpt dataset/view_sampler@dataset.re10k.view_sampler=evaluation dataset.re10k.view_sampler.index_path=assets/evaluation_index_re10k.json
# ACID
python -m src.eval_pose +experiment=acid +evaluation=eval_pose checkpointing.load=./pretrained_weights/mixRe10kDl3dv.ckpt dataset/view_sampler@dataset.re10k.view_sampler=evaluation dataset.re10k.view_sampler.index_path=assets/evaluation_index_acid.json
# ScanNet-1500
python -m src.eval_pose +experiment=scannet_pose +evaluation=eval_pose checkpointing.load=./pretrained_weights/mixRe10kDl3dv.ckpt

Note that here we show the evaluation using the mixed model trained on RealEstate10K and DL3DV. You can replace the checkpoint path with other trained models.

Acknowledgements

This project builds on several fantastic repos: pixelSplat, DUSt3R, and CroCo. We thank the original authors for their excellent work. We also thank David Charatan for kindly providing the evaluation code and the pretrained models for some of the previous methods.

Citation

@article{ye2024noposplat,
      title   = {No Pose, No Problem: Surprisingly Simple 3D Gaussian Splats from Sparse Unposed Images},
      author  = {Ye, Botao and Liu, Sifei and Xu, Haofei and Li, Xueting and Pollefeys, Marc and Yang, Ming-Hsuan and Peng, Songyou},
      journal = {arXiv preprint arXiv:xxxx.xxxx},
      year    = {2024}
    }
