Youku-mPLUG 10M: A Large-Scale Chinese Video-Text Dataset

Youku-mPLUG: a large-scale Chinese video-language pre-training dataset of 10 million videos, with benchmarks. Download link here.

Paper

What is Youku-mPLUG?

We release Youku-mPLUG, the largest public high-quality Chinese video-language dataset (10 million videos). It was collected from Youku, a well-known Chinese video-sharing website, under strict criteria for safety, diversity, and quality.

Examples of video clips and titles in the proposed Youku-mPLUG dataset.

We provide three downstream multimodal video benchmark datasets to measure the capabilities of pre-trained models. The three tasks are:

Video category prediction: given a video and its corresponding title, predict the category of the video.
Video-text retrieval: given a set of videos and texts, use a video to retrieve the matching text and a text to retrieve the matching video.
Video captioning: given a video, describe its content.
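
Retrieval in both directions is commonly scored with Recall@K. The sketch below is illustrative only: it assumes a precomputed similarity matrix in which `sim[i][j]` scores video `i` against text `j` and matching pairs share an index. The names `recall_at_k` and `transpose` and the toy scores are hypothetical and are not taken from the Youku-mPLUG code.

```python
from typing import List, Sequence


def recall_at_k(sim: Sequence[Sequence[float]], k: int) -> float:
    """Fraction of queries (rows) whose matching item (same index) ranks in the top k."""
    hits = 0
    for i, row in enumerate(sim):
        # Rank candidate indices by descending similarity to query i.
        ranked = sorted(range(len(row)), key=lambda j: row[j], reverse=True)
        if i in ranked[:k]:
            hits += 1
    return hits / len(sim)


def transpose(sim: Sequence[Sequence[float]]) -> List[List[float]]:
    """Swap queries and candidates so texts become the queries."""
    return [list(col) for col in zip(*sim)]


# Toy scores (made-up numbers): rows are videos, columns are texts.
sim = [
    [0.9, 0.1, 0.3],
    [0.2, 0.4, 0.8],
    [0.1, 0.7, 0.6],
]
video_to_text_r1 = recall_at_k(sim, k=1)
text_to_video_r1 = recall_at_k(transpose(sim), k=1)
print(video_to_text_r1, text_to_video_r1)
```

Transposing the matrix reuses the same function for text-to-video retrieval, which keeps the two directions consistent.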

Data statistics

The dataset contains 10 million high-quality videos in total, distributed across 20 super-categories and 45 categories.

Distribution of categories in the Youku-mPLUG dataset.

Zero-shot capability

[Figures: zero-shot examples, Case 1 and Case 2]

Download

You can download all the videos and annotation files through this link.

Setup

Note: because of a bug in megatron_util, after installing it you need to replace conda/envs/youku/lib/python3.10/site-packages/megatron_util/initialize.py with the initialize.py in the current directory.

conda env create -f environment.yml
conda activate youku
pip install megatron_util==1.3.0 -f https://modelscope.oss-cn-beijing.aliyuncs.com/releases/repo.html

# For caption evaluation
apt-get install default-jre
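
The manual file replacement described in the note can be scripted. The following is a minimal sketch, assuming the repository's patched `initialize.py` sits in the current working directory and that `megatron_util` is importable in the active environment. The helper names `find_package_file` and `replace_with_backup` are hypothetical and not part of this repository or of `megatron_util`.

```python
import importlib.util
import shutil
from pathlib import Path
from typing import Optional


def find_package_file(package: str, filename: str) -> Optional[Path]:
    """Return the path of `filename` inside an installed package, or None."""
    spec = importlib.util.find_spec(package)
    if spec is None or not spec.submodule_search_locations:
        return None
    candidate = Path(next(iter(spec.submodule_search_locations))) / filename
    return candidate if candidate.exists() else None


def replace_with_backup(patched: Path, target: Path) -> Path:
    """Copy `patched` over `target`, keeping the original as `<name>.bak`."""
    backup = target.with_name(target.name + ".bak")
    shutil.copy2(target, backup)
    shutil.copy2(patched, target)
    return backup


if __name__ == "__main__":
    target = find_package_file("megatron_util", "initialize.py")
    # Assumption: the repository's copy is in the current directory.
    patched = Path("initialize.py")
    if target is None or not patched.exists():
        print("megatron_util or the patched initialize.py was not found; nothing changed")
    else:
        print("backup written to", replace_with_backup(patched, target))
```

Looking the package up with `importlib` avoids hard-coding the conda path, and the backup makes the change easy to revert.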

mPLUG-Video (1.3B / 2.7B)

Pre-train

First, download the GPT-3 1.3B and 2.7B checkpoints from ModelScope. The pre-trained models can be downloaded here (1.3B) and here (2.7B).

Run the pre-training of mPLUG-Video as follows:

exp_name='pretrain/gpt3_1.3B/pretrain_gpt3_freezeGPT_youku_v0'
# Create the output directory first so that tee can write the log file.
mkdir -p ./output/${exp_name}
PYTHONPATH=$PYTHONPATH:./ \
python -m torch.distributed.launch --nproc_per_node=8 --master_addr=$MASTER_ADDR \
    --master_port=$MASTER_PORT \
    --nnodes=$WORLD_SIZE \
    --node_rank=$RANK \
    --use_env run_pretrain_distributed_gpt3.py \
    --config ./configs/${exp_name}.yaml \
    --output_dir ./output/${exp_name} \
    --enable_deepspeed \
    --bf16 \
    2>&1 | tee ./output/${exp_name}/train.log
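
With `--use_env`, `torch.distributed.launch` hands each worker its ranks through environment variables (`RANK`, `LOCAL_RANK`, `WORLD_SIZE`) instead of a `--local_rank` argument. The sketch below shows one way a training script might read them. It is an assumption for illustration, not how `run_pretrain_distributed_gpt3.py` is known to do it, and `DistEnv` and `read_dist_env` are hypothetical names.

```python
import os
from typing import Mapping, NamedTuple


class DistEnv(NamedTuple):
    rank: int        # global process index across all nodes
    local_rank: int  # process index on this node, usually the GPU index
    world_size: int  # total number of processes


def read_dist_env(env: Mapping[str, str] = os.environ) -> DistEnv:
    """Read launcher-provided variables, defaulting to a single-process run."""
    return DistEnv(
        rank=int(env.get("RANK", 0)),
        local_rank=int(env.get("LOCAL_RANK", 0)),
        world_size=int(env.get("WORLD_SIZE", 1)),
    )


# Illustrative values: worker 3 on the second of two 8-GPU nodes.
example = read_dist_env({"RANK": "11", "LOCAL_RANK": "3", "WORLD_SIZE": "16"})
is_main_process = example.rank == 0
print(example, is_main_process)
```

Inside a worker, `RANK` and `WORLD_SIZE` count processes, whereas the `$RANK` and `$WORLD_SIZE` passed to `--node_rank` and `--nnodes` in the command count nodes.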

Benchmarking

To perform downstream fine-tuning, we take video category prediction as an example:

exp_name='cls/cls_gpt3_1.3B_youku_v0_sharp_2'
# Create the output directory first so that tee can write the log file.
mkdir -p ./output/${exp_name}
PYTHONPATH=$PYTHONPATH:./ \
python -m torch.distributed.launch --nproc_per_node=8 --master_addr=$MASTER_ADDR \
    --master_port=$MASTER_PORT \
    --nnodes=$WORLD_SIZE \
    --node_rank=$RANK \
    --use_env downstream/run_cls_distributed_gpt3.py \
    --config ./configs/${exp_name}.yaml \
    --output_dir ./output/${exp_name} \
    --enable_deepspeed \
    --resume path/to/1_3B_mp_rank_00_model_states.pt \
    --bf16 \
    2>&1 | tee ./output/${exp_name}/train.log

Experimental results

Below we show the results on the validation sets for reference.

[Figures: video category prediction results on the validation set; video retrieval results on the validation set]

mPLUG-Video (BloomZ-7B)

We build the mPLUG-Video model on top of mPLUG-Owl. To use the model, first clone the mPLUG-Owl repo:

git clone https://github.com/X-PLUG/mPLUG-Owl.git
cd mPLUG-Owl/mPLUG-Owl

The instruction-tuned checkpoint is available on HuggingFace. To fine-tune the model, refer to the mPLUG-Owl repo. To run video inference, you can use the following code:

import torch
from mplug_owl_video.modeling_mplug_owl import MplugOwlForConditionalGeneration
from transformers import AutoTokenizer
from mplug_owl_video.processing_mplug_owl import MplugOwlImageProcessor, MplugOwlProcessor

# Load the model in bfloat16 and place it on GPU 0.
pretrained_ckpt = 'MAGAer13/mplug-youku-bloomz-7b'
model = MplugOwlForConditionalGeneration.from_pretrained(
    pretrained_ckpt,
    torch_dtype=torch.bfloat16,
    device_map={'': 0},
)
image_processor = MplugOwlImageProcessor.from_pretrained(pretrained_ckpt)
tokenizer = AutoTokenizer.from_pretrained(pretrained_ckpt)
processor = MplugOwlProcessor(image_processor, tokenizer)

# We use a human/AI template to organize the context as a multi-turn conversation.
# <|video|> denotes a video placeholder.
prompts = [
'''The following is a conversation between a curious human and AI assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.
Human: <|video|>
Human: 视频中的女人在干什么？
AI: ''']

video_list = ['yoga.mp4']

# Generation kwargs (the same as in transformers) are passed to model.generate().
generate_kwargs = {
    'do_sample': True,
    'top_k': 5,
    'max_length': 512
}
inputs = processor(text=prompts, videos=video_list, num_frames=4, return_tensors='pt')
# Cast float32 tensors to bfloat16 to match the model, then move everything to its device.
inputs = {k: v.bfloat16() if v.dtype == torch.float else v for k, v in inputs.items()}
inputs = {k: v.to(model.device) for k, v in inputs.items()}
with torch.no_grad():
    res = model.generate(**inputs, **generate_kwargs)
sentence = tokenizer.decode(res.tolist()[0], skip_special_tokens=True)
print(sentence)
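
The prompt used above follows a fixed template: a system sentence, a `Human: <|video|>` line, the question, and a trailing `AI: ` for the model to complete. A small helper can assemble it for other questions. This is a sketch only: `build_prompt` is a hypothetical name, and the way earlier turns are appended through `history` is an assumption extrapolated from the single-turn example rather than a documented format.

```python
from typing import Iterable, Tuple

SYSTEM_PROMPT = (
    "The following is a conversation between a curious human and AI assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's questions."
)
VIDEO_TOKEN = "<|video|>"


def build_prompt(question: str, history: Iterable[Tuple[str, str]] = ()) -> str:
    """Assemble the prompt; `history` holds earlier (question, answer) turns."""
    lines = [SYSTEM_PROMPT, f"Human: {VIDEO_TOKEN}"]
    for past_question, past_answer in history:
        # Assumed layout for earlier turns: alternate Human/AI lines.
        lines.append(f"Human: {past_question}")
        lines.append(f"AI: {past_answer}")
    lines.append(f"Human: {question}")
    lines.append("AI: ")
    return "\n".join(lines)


prompt = build_prompt("视频中的女人在干什么？")
print(prompt)
```

The resulting string can be passed as an element of `prompts` in the inference code.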

Citing Youku-mPLUG

If you find this dataset useful for your research, please consider citing our paper.

@misc{xu2023youku_mplug,
    title={Youku-mPLUG: A 10 Million Large-scale Chinese Video-Language Dataset for Pre-training and Benchmarks},
    author={Haiyang Xu, Qinghao Ye, Xuan Wu, Ming Yan, Yuan Miao, Jiabo Ye, Guohai Xu, Anwen Hu, Yaya Shi, Chenliang Li, Qi Qian, Que Maofei, Ji Zhang, Xiao Zeng, Fei Huang},
    year={2023},
    eprint={2306.04362},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}


Additional information

Version 1.0.0
Type Other source code
Updated 2024-12-13
Size 15.45MB
Source GitHub
