The Scene Language: Representing Scenes with Programs, Words, and Embeddings

arXiv | Project Page

Yunzhi Zhang, Zizhang Li, Matt Zhou, Shangzhe Wu, Jiajun Wu. arXiv preprint 2024.

Installation

Environment

conda create --name sclg python=3.11
conda activate sclg
pip install mitsuba
# if you run into segmentation fault, you might need specific mitsuba versions
# e.g., `pip install --force-reinstall mitsuba==3.5.1` on MacOS
pip install unidecode Pillow anthropic transforms3d astor ipdb scipy jaxtyping imageio

# required for minecraft renderer
pip install spacy
python -m spacy download en_core_web_md
pip install --force-reinstall numpy==1.26.4  # to be compatible with transforms3d

git clone https://github.com/zzyunzhi/scene-language.git
cd scene-language
pip install -e .

Language Model API

Get your Anthropic API key following the official documentation and add it to engine/key.py:

ANTHROPIC_API_KEY = 'YOUR_ANTHROPIC_API_KEY'
OPENAI_API_KEY = 'YOUR_OPENAI_API_KEY'  # optional, required for `LLM_PROVIDER='gpt'`

Claude 3.5 Sonnet is used by default. You can switch to other language models by setting LLM_PROVIDER in engine/constants.py.
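If you would rather not write keys into a file, one option is to read them from environment variables. The sketch below is an assumption about how engine/key.py could be written, not how the repository ships it; the helper name `load_key` is hypothetical, and only the two variable names come from the snippet above.

```python
import os
from typing import Optional


def load_key(name: str, required: bool = True) -> Optional[str]:
    """Read an API key from the environment (hypothetical helper).

    Raises RuntimeError when a required key is missing or empty;
    returns None when an optional key is missing or empty.
    """
    value = os.environ.get(name)
    if required and not value:
        raise RuntimeError(f"environment variable {name} is not set")
    return value or None


# Possible contents of engine/key.py under this assumption:
# ANTHROPIC_API_KEY = load_key("ANTHROPIC_API_KEY")
# OPENAI_API_KEY = load_key("OPENAI_API_KEY", required=False)  # only for LLM_PROVIDER='gpt'
```

With this layout, you would export the keys in your shell before running any script.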

Text-Conditioned 3D Generation

Renderer: Mitsuba

python scripts/run.py --tasks "a chessboard with a full set of chess pieces"

Renderings will be saved to ${PROJ_ROOT}/scripts/outputs/run_${timestep}_${uuid}/${scene_name}_${uuid}/${sample_index}/renderings/*.gif.
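Because the output path is nested several levels deep, a small helper can collect the rendered files after a run. This is a minimal sketch that assumes only the directory pattern quoted above; the function name `find_renderings` is hypothetical and not part of the codebase.

```python
from pathlib import Path


def find_renderings(proj_root, pattern="*.gif"):
    """Return sorted paths matching
    <proj_root>/scripts/outputs/run_*/<scene>/<sample_index>/renderings/<pattern>.
    """
    base = Path(proj_root) / "scripts" / "outputs"
    return sorted(base.glob(f"run_*/*/*/renderings/{pattern}"))


if __name__ == "__main__":
    # Assumes the script is started from the repository root.
    for path in find_renderings("."):
        print(path)
```

Passing `pattern="*.json"` collects JSON outputs stored under the same directory layout.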

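Example results (raw outputs here):

"a chessboard with a full set of chess pieces"	"A 9x9 Sudoku board partially filled with numbers"	"a scene inspired by Egon Schiele"	"a Roman Colosseum"	"a spider puppet"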

Renderer: Minecraft

ENGINE_MODE=minecraft python scripts/run.py --tasks "a detailed cylindrical medieval tower"

Generated scenes are saved as json files in ${PROJ_ROOT}/scripts/outputs/run_${timestep}_${uuid}/${scene_name}_${uuid}/${sample_index}/renderings/*.json. For visualization, run the following command:

python viewers/minecraft/run.py

Then open http://127.0.0.1:5001 in your browser and drag the generated json files onto the web page.

Example results (raw outputs here):

"a witch's house in Halloween"	"a detailed cylindrical medieval tower"	"a detailed model of Pikachu"	"Stonehenge"	"a Greek temple"

Image-Conditioned 3D Generation

python scripts/run.py --tasks ./resources/examples/* --cond image --temperature 0.8
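If you launch runs from Python rather than the shell, the commands in this README can be assembled programmatically. The sketch below uses only the flags and the ENGINE_MODE variable shown in this README; the function name `build_run_command` is hypothetical, and passing several values to `--tasks` is an assumption inferred from the shell glob in the image-conditioned command.

```python
import os
import shlex


def build_run_command(tasks, cond=None, temperature=None, engine_mode=None):
    """Build (argv, env) for scripts/run.py without executing anything."""
    argv = ["python", "scripts/run.py", "--tasks", *tasks]
    if cond is not None:
        argv += ["--cond", cond]
    if temperature is not None:
        argv += ["--temperature", str(temperature)]
    env = dict(os.environ)  # copy, so the caller's environment is untouched
    if engine_mode is not None:
        env["ENGINE_MODE"] = engine_mode
    return argv, env


argv, env = build_run_command(
    ["a detailed cylindrical medieval tower"], engine_mode="minecraft"
)
print(shlex.join(argv))  # shell-quoted form of the command, for inspection
# To launch it from the repository root:
# subprocess.run(argv, env=env, check=True)
```

Building the argument list explicitly avoids shell quoting issues when a prompt contains spaces or quotes.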

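Codebase Details

Macro definitions

The following table lists the helper functions defined in this file and the domain-specific language (DSL) expressions they correspond to (Tables 2 and 5 of the paper).

Implementation	DSL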
`register`	`bind`
`library_call`	`call`
`primitive_call`	`call`
`loop`	`union-loop`
`concat_shapes`	`union`
`transform_shape`	`transform`
`rotation_matrix`	`rotation`
`translation_matrix`	`translate`
`scale_matrix`	`scale`
`reflection_matrix`	`reflect`
`compute_shape_center`	`compute-shape-center`
`compute_shape_min`	`compute-shape-min`
`compute_shape_max`	`compute-shape-max`
`compute_shape_sizes`	`compute-shape-sizes`
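To give a feel for what the `transform`, `union`, and `union-loop` expressions mean operationally, here is a minimal pure-Python sketch. Every name in it (`make_translation`, `make_rotation_z`, `apply_transform`, `union`, `union_loop`) and the choice to represent a shape as a list of 3D points are assumptions made for illustration; they are not the repository's helpers, whose signatures and data layout may differ.

```python
import math

# Hypothetical stand-ins, NOT the repository's API.
# A "shape" is a list of (x, y, z) points; a pose is a 4x4 row-major matrix.


def make_translation(offset):
    """Homogeneous matrix that moves points by `offset` = (dx, dy, dz)."""
    dx, dy, dz = offset
    return [[1, 0, 0, dx], [0, 1, 0, dy], [0, 0, 1, dz], [0, 0, 0, 1]]


def make_rotation_z(angle):
    """Homogeneous matrix rotating by `angle` radians about the z-axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0, 0], [s, c, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]


def apply_transform(shape, pose):
    """Apply a 4x4 pose to every point of a shape (the idea behind `transform`)."""
    out = []
    for x, y, z in shape:
        v = (x, y, z, 1.0)
        out.append(tuple(sum(row[k] * v[k] for k in range(4)) for row in pose[:3]))
    return out


def union(*shapes):
    """Merge several shapes into one (the idea behind `union`)."""
    return [p for shape in shapes for p in shape]


def union_loop(n, fn):
    """Union of fn(0), ..., fn(n - 1) (the idea behind `union-loop`)."""
    return union(*(fn(i) for i in range(n)))


# Four copies of a one-point shape, spaced one unit apart along the x-axis.
unit = [(0.0, 0.0, 0.0)]
row = union_loop(
    4, lambda i: apply_transform(unit, make_translation((float(i), 0.0, 0.0)))
)
print(row)
```

The point of the sketch is the compositional pattern: a scene is built by transforming small shapes and taking unions of the results, optionally inside a loop.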

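Codebase improvements

The current codebase lets you generate 3D scenes from text or image prompts. Other tasks and renderers reported in the paper will be supported in future updates.

If you have feature requests or suggestions for improvement, or would like to share your results, please submit a PR or email us.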

Citation

If you find this work useful, please consider citing our paper:

@article{zhang2024scenelanguage,
  title   = {The Scene Language: Representing Scenes with Programs, Words, and Embeddings},
  author  = {Yunzhi Zhang and Zizhang Li and Matt Zhou and Shangzhe Wu and Jiajun Wu},
  year    = {2024},
  journal = {arXiv preprint arXiv:2410.16770},
}