Implementation of MeshGPT, SOTA mesh generation using attention, in Pytorch
Will also add text conditioning, for eventual text-to-3d asset
Please join if you are interested in collaborating with others to replicate this work
Update: Marcus has trained a working model and uploaded it to 🤗 Huggingface!
StabilityAI, the A16Z Open Source AI Grant Program, and 🤗 Huggingface for the generous sponsorships, as well as my other sponsors, for affording me the independence to open source current artificial intelligence research
Einops for making my life easy
Marcus for the initial code review (pointing out some missing derived features) as well as running the first successful end-to-end experiments
Marcus for the first successful training of a collection of shapes conditioned on labels
Quexi Ma for finding numerous bugs with automatic eos handling
Yingtian for finding a bug with the gaussian blurring of the positions for spatial label smoothing
Marcus yet again for running the experiments to validate that it is possible to extend the system from triangles to quads
Marcus for identifying an issue with text conditioning and for running all the experiments that led to it being resolved
$ pip install meshgpt-pytorch
import torch

from meshgpt_pytorch import (
    MeshAutoencoder,
    MeshTransformer
)

# autoencoder

autoencoder = MeshAutoencoder(
    num_discrete_coors = 128
)

# mock inputs

vertices = torch.randn((2, 121, 3))       # (batch, num vertices, coor (3))
faces = torch.randint(0, 121, (2, 64, 3)) # (batch, num faces, vertices (3))

# make sure faces are padded with `-1` for variable lengthed meshes
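# illustrative sketch (not from the original readme): for a batch of meshes with
# different face counts, the shorter `faces` tensors can be right-padded with -1
# before stacking, e.g. padding a 50-face mesh out to 64 faces

import torch.nn.functional as F

short_faces = torch.randint(0, 121, (1, 50, 3))                    # (batch, 50 faces, 3 vertex indices)
padded_faces = F.pad(short_faces, (0, 0, 0, 64 - 50), value = -1)  # -> (1, 64, 3), last 14 faces are -1 padding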
# forward in the faces

loss = autoencoder(
    vertices = vertices,
    faces = faces
)

loss.backward()

# after much training...
# you can pass in the raw face data above to train a transformer to model this sequence of face vertices

transformer = MeshTransformer(
    autoencoder,
    dim = 512,
    max_seq_len = 768
)

loss = transformer(
    vertices = vertices,
    faces = faces
)

loss.backward()

# after much training of transformer, you can now sample novel 3d assets

faces_coordinates, face_mask = transformer.generate()

# (batch, num faces, vertices (3), coordinates (3)), (batch, num faces)
# now post process for the generated 3d asset
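As a rough sketch of that post-processing step (the function name and file format choice are my own, not part of the library), the generated face coordinates and mask could be written out as a simple Wavefront OBJ file, assuming face_mask is a boolean mask over valid faces:

def save_as_obj(face_coords, face_mask, path):
    # face_coords: (num faces, 3 vertices, 3 coordinates), face_mask: (num faces,)
    face_coords = face_coords[face_mask.bool()]   # drop padded / invalid faces
    verts = face_coords.reshape(-1, 3).tolist()   # each face contributes 3 vertices (shared vertices are simply duplicated)

    with open(path, 'w') as f:
        for x, y, z in verts:
            f.write(f'v {x} {y} {z}\n')
        for i in range(face_coords.shape[0]):
            # obj vertex indices are 1-based, three consecutive vertices per face
            f.write(f'f {3 * i + 1} {3 * i + 2} {3 * i + 3}\n')

save_as_obj(faces_coordinates[0], face_mask[0], 'sample.obj')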
For text-conditioned 3d shape synthesis, simply set condition_on_text = True on your MeshTransformer, and then pass in your list of descriptions as the texts keyword argument

ex.
transformer = MeshTransformer(
    autoencoder,
    dim = 512,
    max_seq_len = 768,
    condition_on_text = True
)

loss = transformer(
    vertices = vertices,
    faces = faces,
    texts = ['a high chair', 'a small teapot'],
)

loss.backward()

# after much training of transformer, you can now sample novel 3d assets conditioned on text

faces_coordinates, face_mask = transformer.generate(
    texts = ['a long table'],
    cond_scale = 8., # a cond_scale > 1. will enable classifier free guidance - can be placed anywhere from 3. - 10.
    remove_parallel_component = True # from https://arxiv.org/abs/2410.02416
)
If you wish to tokenize meshes, for use in your multimodal transformer, simply invoke .tokenize on your autoencoder (or the same method on the autoencoder trainer instance for the exponentially smoothed model)
mesh_token_ids = autoencoder.tokenize(
    vertices = vertices,
    faces = faces
)

# (batch, num face vertices, residual quantized layer)
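If you need a single flat token sequence (for example, to interleave mesh tokens with text tokens in a multimodal transformer), one simple option, sketched here as an assumption rather than documented behaviour, is to fold the residual-quantizer dimension into the sequence dimension:

# mesh_token_ids: (batch, num face vertices, residual quantized layer)
mesh_token_seq = mesh_token_ids.reshape(mesh_token_ids.shape[0], -1) # (batch, num face vertices * num quantizers)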
At the project root, run
$ cp .env.sample .env
autoencoder
face_edges
transformer
trainer wrapper with hf accelerate
text conditioning using own CFG library
hierarchical transformers (use the RQ transformer)
fix caching in simple gateloop layer in other repo
local attention
fix kv caching for two-staged hierarchical transformer - 7x faster now, and faster than original non-hierarchical transformer
fix caching for gateloop layers
allow for customization of model dimensions of fine vs coarse attention network
figure out if autoencoder is really necessary - it is, ablations are in the paper
make the transformer more efficient
speculative decoding option
spend a day on documentation
@inproceedings{Siddiqui2023MeshGPTGT,
    title  = {MeshGPT: Generating Triangle Meshes with Decoder-Only Transformers},
    author = {Yawar Siddiqui and Antonio Alliegro and Alexey Artemov and Tatiana Tommasi and Daniele Sirigatti and Vladislav Rosov and Angela Dai and Matthias Nie{\ss}ner},
    year   = {2023},
    url    = {https://api.semanticscholar.org/CorpusID:265457242}
}

@inproceedings{dao2022flashattention,
    title     = {Flash{A}ttention: Fast and Memory-Efficient Exact Attention with {IO}-Awareness},
    author    = {Dao, Tri and Fu, Daniel Y. and Ermon, Stefano and Rudra, Atri and R{\'e}, Christopher},
    booktitle = {Advances in Neural Information Processing Systems},
    year      = {2022}
}

@inproceedings{Leviathan2022FastIF,
    title     = {Fast Inference from Transformers via Speculative Decoding},
    author    = {Yaniv Leviathan and Matan Kalman and Y. Matias},
    booktitle = {International Conference on Machine Learning},
    year      = {2022},
    url       = {https://api.semanticscholar.org/CorpusID:254096365}
}

@misc{yu2023language,
    title         = {Language Model Beats Diffusion -- Tokenizer is Key to Visual Generation},
    author        = {Lijun Yu and José Lezama and Nitesh B. Gundavarapu and Luca Versari and Kihyuk Sohn and David Minnen and Yong Cheng and Agrim Gupta and Xiuye Gu and Alexander G. Hauptmann and Boqing Gong and Ming-Hsuan Yang and Irfan Essa and David A. Ross and Lu Jiang},
    year          = {2023},
    eprint        = {2310.05737},
    archivePrefix = {arXiv},
    primaryClass  = {cs.CV}
}

@article{Lee2022AutoregressiveIG,
    title   = {Autoregressive Image Generation using Residual Quantization},
    author  = {Doyup Lee and Chiheon Kim and Saehoon Kim and Minsu Cho and Wook-Shin Han},
    journal = {2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    year    = {2022},
    pages   = {11513-11522},
    url     = {https://api.semanticscholar.org/CorpusID:247244535}
}

@inproceedings{Katsch2023GateLoopFD,
    title  = {GateLoop: Fully Data-Controlled Linear Recurrence for Sequence Modeling},
    author = {Tobias Katsch},
    year   = {2023},
    url    = {https://api.semanticscholar.org/CorpusID:265018962}
}