Visual ChatGPT

Visual ChatGPT connects ChatGPT and a series of Visual Foundation Models so that images can be sent and received during a chat.

See our paper: Visual ChatGPT: Talking, Drawing and Editing with Visual Foundation Models

Updates:

Add custom GPU/CPU assignment
Add Windows support
Merge HuggingFace ControlNet, remove download.sh
Add Prompt Decorator
Add HuggingFace and Colab demo
Clean up requirements

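Insight & Goal:

On the one hand, ChatGPT (or an LLM) serves as a general interface that provides a broad and diverse understanding of a wide range of topics. On the other hand, Foundation Models serve as domain experts by providing deep knowledge in specific domains. By leveraging both general and deep knowledge, we aim to build an AI capable of handling a wide variety of tasks.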

Demo

System Architecture

Quick Start

# clone the repo
git clone https://github.com/microsoft/visual-chatgpt.git

# Go to directory
cd visual-chatgpt

# create a new environment
conda create -n visgpt python=3.8

# activate the new environment
conda activate visgpt

# prepare the basic environments
pip install -r requirements.txt

# prepare your private OpenAI key (for Linux)
export OPENAI_API_KEY={Your_Private_Openai_Key}

# prepare your private OpenAI key (for Windows Command Prompt)
set OPENAI_API_KEY={Your_Private_Openai_Key}

# Start Visual ChatGPT !
# You can specify the GPU/CPU assignment with "--load". The parameter indicates which
# Visual Foundation Models to use and which device each one is loaded onto.
# The model and device are separated by an underscore '_'; different models are separated by a comma ','.
# The available Visual Foundation Models are listed in the GPU memory usage table.
# For example, to load ImageCaptioning onto the CPU and Text2Image onto cuda:0,
# use: "ImageCaptioning_cpu,Text2Image_cuda:0"

# Advice for CPU Users
python visual_chatgpt.py --load ImageCaptioning_cpu,Text2Image_cpu

# Advice for 1 Tesla T4 15GB (Google Colab)
python visual_chatgpt.py --load "ImageCaptioning_cuda:0,Text2Image_cuda:0"

# Advice for 4 Tesla V100 32GB
python visual_chatgpt.py --load "ImageCaptioning_cuda:0,ImageEditing_cuda:0,
    Text2Image_cuda:1,Image2Canny_cpu,CannyText2Image_cuda:1,
    Image2Depth_cpu,DepthText2Image_cuda:1,VisualQuestionAnswering_cuda:2,
    InstructPix2Pix_cuda:2,Image2Scribble_cpu,ScribbleText2Image_cuda:2,
    Image2Seg_cpu,SegText2Image_cuda:2,Image2Pose_cpu,PoseText2Image_cuda:2,
    Image2Hed_cpu,HedText2Image_cuda:3,Image2Normal_cpu,
    NormalText2Image_cuda:3,Image2Line_cpu,LineText2Image_cuda:3"
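
The "--load" format described above (model and device joined by '_', entries joined by ',') can be illustrated with a small parser. This is only a sketch of the format, not the code in visual_chatgpt.py: parse_load is a hypothetical helper name, and stripping whitespace around each entry is an assumption made here so that multi-line values like the V100 example parse cleanly.

```python
from typing import Dict


def parse_load(spec: str) -> Dict[str, str]:
    """Turn 'Model_device,Model_device' into {model: device}.

    Hypothetical helper for illustration; not part of the repository.
    """
    assignments: Dict[str, str] = {}
    for entry in spec.split(","):
        entry = entry.strip()  # assumption: tolerate spaces and newlines
        if not entry:
            continue
        # Split on the first underscore: model name on the left, device on the right.
        model, sep, device = entry.partition("_")
        if not sep or not model or not device:
            raise ValueError(f"expected 'Model_device', got {entry!r}")
        assignments[model] = device
    return assignments


if __name__ == "__main__":
    print(parse_load("ImageCaptioning_cpu,Text2Image_cuda:0"))
```

A malformed entry such as "Text2Image" (no device) raises ValueError instead of being silently dropped, which makes typos in a long "--load" string easier to spot.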

GPU memory usage

Here we list the GPU memory usage of each Visual Foundation Model, so you can choose which ones to load:

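Foundation Model	GPU Memory (MB)
ImageEditing	3981
InstructPix2Pix	2827
Text2Image	3385
ImageCaptioning	1209
Image2Canny	0
CannyText2Image	3531
Image2Line	0
LineText2Image	3529
Image2Hed	0
HedText2Image	3529
Image2Scribble	0
ScribbleText2Image	3531
Image2Pose	0
PoseText2Image	3529
Image2Seg	919
SegText2Image	3529
Image2Depth	0
DepthText2Image	3531
Image2Normal	0
NormalText2Image	3529
VisualQuestionAnswering	1495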
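
The figures in the table can be used to estimate how much GPU memory a given "--load" value needs on each device. The sketch below sums the listed values per CUDA device. The memory_per_device function and the GPU_MEMORY_MB dictionary are hypothetical names introduced for illustration, the model keys are the names used with "--load", and models placed on the CPU are assumed to use no GPU memory. Actual usage at runtime may be higher than these sums.

```python
from collections import defaultdict
from typing import Dict

# GPU memory per model in MB, copied from the table above.
GPU_MEMORY_MB = {
    "ImageEditing": 3981, "InstructPix2Pix": 2827, "Text2Image": 3385,
    "ImageCaptioning": 1209, "Image2Canny": 0, "CannyText2Image": 3531,
    "Image2Line": 0, "LineText2Image": 3529, "Image2Hed": 0,
    "HedText2Image": 3529, "Image2Scribble": 0, "ScribbleText2Image": 3531,
    "Image2Pose": 0, "PoseText2Image": 3529, "Image2Seg": 919,
    "SegText2Image": 3529, "Image2Depth": 0, "DepthText2Image": 3531,
    "Image2Normal": 0, "NormalText2Image": 3529,
    "VisualQuestionAnswering": 1495,
}


def memory_per_device(spec: str) -> Dict[str, int]:
    """Sum the table values for every model assigned to a CUDA device."""
    totals: Dict[str, int] = defaultdict(int)
    for entry in spec.split(","):
        model, _, device = entry.strip().partition("_")
        if device.startswith("cuda"):  # assumption: CPU models use no GPU memory
            totals[device] += GPU_MEMORY_MB[model]
    return dict(totals)


if __name__ == "__main__":
    # The single-T4 configuration from the quick start.
    print(memory_per_device("ImageCaptioning_cuda:0,Text2Image_cuda:0"))
```

For the single-T4 configuration this gives 1209 + 3385 = 4594 MB on cuda:0. An unknown model name raises KeyError, which doubles as a check against misspelled entries.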