dialogue generation

Implementation of a neural dialogue generator model with pretrained XLNet Yang et al. (2019) and GPT2 Radford et al. (2019) architectures on currently three datasets: DailyDialog Li et al. (2017), PersonaChat Zhang et al. (2018) and the new TopicalChat Gopalakrishnan et al. (2019) from the Alexa Prize Socialbot Grand Challenge 3. Top-k sampling Fan et al. (2018) and nucleus decoding Holtzman et al. (2019) are available as decoding techniques. The training objective is autoregressive language modeling on the utterances and dialogue histories.
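To make the two decoding techniques concrete, here is a minimal sketch in plain Python. It is not the repository's implementation: the function names are hypothetical, and it works on a small list of scores rather than on model logits. Top-k keeps a fixed number of the most probable tokens, while nucleus decoding keeps the smallest set of top tokens whose cumulative probability reaches a threshold p.

```python
import math
import random


def softmax(logits):
    """Convert raw scores to probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_filter(probs, k):
    """Keep the k most probable tokens and renormalize their probabilities."""
    keep = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    mass = sum(probs[i] for i in keep)
    return {i: probs[i] / mass for i in keep}


def nucleus_filter(probs, p):
    """Keep the smallest set of top tokens whose cumulative probability reaches p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, cumulative = [], 0.0
    for i in order:
        keep.append(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    mass = sum(probs[i] for i in keep)
    return {i: probs[i] / mass for i in keep}


def sample(filtered, rng=random):
    """Draw one token id from a {token_id: probability} mapping."""
    ids = list(filtered)
    return rng.choices(ids, weights=[filtered[i] for i in ids], k=1)[0]


# Toy vocabulary of four tokens with hypothetical scores.
probs = softmax([2.0, 1.0, 0.5, -1.0])
top2 = top_k_filter(probs, k=2)
nucleus = nucleus_filter(probs, p=0.8)
next_token = sample(nucleus)
```

With top-k the candidate set always has the same size; with nucleus decoding it shrinks when the model is confident and grows when the distribution is flat.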

Installation

The model can leverage mixed precision training from nvidia/apex. Note that apex is not required and is only used if it is available. For the installation guide, see the official instructions. This module does not help on every GPU (only Volta and Turing), so check beforehand whether your instance supports mixed precision training.
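The "used only if available" behaviour can be sketched as a check for whether an optional package is importable. This is an illustration, not the repository's code: the helper names `is_available` and `choose_precision` are hypothetical, and the sketch only detects the package, it does not check the GPU architecture.

```python
import importlib.util


def is_available(module_name):
    """Return True if a top-level module can be imported in this environment."""
    return importlib.util.find_spec(module_name) is not None


def choose_precision(module_name="apex"):
    """Pick 'mixed' only when the optional dependency is installed."""
    return "mixed" if is_available(module_name) else "full"


precision = choose_precision()
```

Because the check never imports the package eagerly, a missing apex installation falls back to full precision instead of raising an error.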

To train the model, clone this repository and install the dependencies. The project uses Cython to assemble batches for a faster input pipeline. Using a Python virtualenv is also recommended.

 git clone https://github.com/bme-chatbots/dialogue-generation.git

cd dialogue-generation

pip install -r requirements.txt

python setup.py build_ext --inplace

Training

The following command starts training on a single GPU/CPU with the gpt2-medium model on PersonaChat. --name is the name of the subdirectory in the model folder where logs and checkpoints are saved.

 python -m src.train --model gpt2-medium --data personachat --name my_test_run

For distributed multi-GPU training, the train script should be called like this.

 python -m torch.distributed.launch --nproc_per_node=NUM_GPUS src/train.py --model gpt2

You can also use predefined configs by passing the path of a config JSON file as the --config argument. These are available in the src/configs folder, and their training results can be seen below in the Results section.

 python -m src.train --config src/configs/xlnet-dailydialog.json
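One common way to combine a JSON config with command-line flags is to load the file as parser defaults, so that flags typed explicitly still take precedence. The sketch below illustrates that pattern only; it is an assumption, not how src/train.py is known to work. The flag names come from this README, while the default values and the contents of the config file are hypothetical.

```python
import argparse
import json
import os
import tempfile


def build_parser():
    """Flags named in this README; the default values are placeholders."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--config", default=None, help="path to a JSON config")
    parser.add_argument("--model", default="gpt2")
    parser.add_argument("--data", default="dailydialog")
    parser.add_argument("--name", default=None)
    return parser


def parse_args(argv):
    """Parse argv, letting a JSON config replace the built-in defaults."""
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.config is not None:
        with open(args.config, encoding="utf-8") as handle:
            parser.set_defaults(**json.load(handle))
        # Parse again so flags typed on the command line still win.
        args = parser.parse_args(argv)
    return args


# Demonstration with a throwaway config file (hypothetical contents).
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "example.json")
    with open(path, "w", encoding="utf-8") as handle:
        json.dump({"model": "gpt2-medium", "data": "personachat"}, handle)
    from_file = parse_args(["--config", path])
    overridden = parse_args(["--config", path, "--model", "gpt2"])
```

Here `from_file` takes both values from the JSON file, while `overridden` keeps the dataset from the file but uses the model given on the command line.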

Training the model is fast and easy on Google Colaboratory or a Kaggle kernel. It is important to set the runtime type to GPU with the newer Tesla P100 or Tesla T4 unit, as it can fully leverage mixed precision training and is much faster than the older Tesla K80. You can check the current type by running !nvidia-smi in a cell of your colab.

As a shortcut, a complete example gist is available that you can import into your Google Drive as a Colaboratory file.

Copy and run the following code in a cell of your colab file (or Kaggle kernel) to install the model. If you use a Kaggle kernel, you also have to enable internet access.

 ! git clone https://github.com/bme-chatbots/dialogue-generation.git
! python -m pip install --upgrade pip

# installing apex is optional and is only useful if Colab's Tesla P100 or T4 is used
# !git clone https://github.com/NVIDIA/apex
# !cd apex; pip install -v --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" .

# building the cython code and installing the required packages
! cd dialogue-generation ; pip install -r requirements.txt ; python setup.py build_ext --inplace

The training and validation metrics are logged to Tensorboard, which can also be tracked in the colab file if the code below is run before the training cell.

%load_ext tensorboard

%tensorboard --logdir "dialogue-generation/model"

The model can then be trained by simply running the train script with the default flags. You can see all flags accepted by the train.py script by passing the -h flag.

 ! cd dialogue-generation ; python -m src.train

After training, the model can be downloaded by setting the download link in the following snippet to the path logged by the script after evaluation ( Saving model to dialogue-generation/src/../model/gpt2/19.11.03-12:59:47/model.pt ).

from IPython.display import FileLink

# note that in case of a Kaggle kernel you have to give the path
# relative to your working directory
FileLink(r'dialogue-generation/src/../model/gpt2/19.11.03-12:59:47/model.pt')

Interaction

An interactive evaluation mode is available for the trained model by running the interact script and providing the path of the trained model with --model_file. You can also provide the --config file, or simply give the same --model and --name arguments that were used during training.

 python -m src.interact --model gpt2-medium --name my_test_run

 python -m src.interact --config src/configs/xlnet-dailydialog.json

Customization

To train any model on your own dataset, you only have to subclass DialogDataset and implement data generation from the raw files. The raw files are train.txt, valid.txt and test.txt placed in data/<name of your data>, where each turn in a dialog is on a new line and separate dialogs are divided by an extra empty line.

├── data
|   ├── dailydialog
|   └── custom_dataset   # name of your data ( custom_dataset by default )
|       ├── train.txt    # the correctly formatted train, valid and test files
|       ├── valid.txt
|       └── test.txt
├── src

 Hello how are you?
Hi I'm fine thanks. And you?
Me too thanks for asking.

Hi my name Peter.
Nice to meet you I am Eric.

An example custom dataset class named CustomDataset is implemented in data.py, which reads a dataset with these properties.
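The file format described above (one turn per line, dialogs separated by an empty line) can be read with a few lines of Python. This is a standalone sketch for checking your own files, not the CustomDataset class from data.py; the function name `read_dialogs` is hypothetical.

```python
def read_dialogs(lines):
    """Group lines into dialogs; an empty line ends the current dialog."""
    dialogs, current = [], []
    for line in lines:
        turn = line.strip()
        if turn:
            current.append(turn)
        elif current:
            dialogs.append(current)
            current = []
    if current:
        # The last dialog may not be followed by an empty line.
        dialogs.append(current)
    return dialogs


raw = """Hello how are you?
Hi I'm fine thanks. And you?
Me too thanks for asking.

Hi my name Peter.
Nice to meet you I am Eric.
"""

dialogs = read_dialogs(raw.splitlines())
```

For a file on disk, the same function can be called with an open file handle, since iterating over a text file yields its lines.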

Results

The dialog below is sampled from gpt2 trained with default parameters for 7 epochs ( 149051 steps ) and --max_hist 3 on the PersonaChat dataset.

 User: hi

Bot: greetings, i'm gerald.

User: how are you ?

Bot: good, thanks. getting ready for another hike, i love hiking!

User: oh realy? where are you going?

Bot: pikes peak. i am curently exploring famous mountains when i have time.

User: that is cool are you going alone?

Bot: my brother is in the us, we go hiking with him every month.

User: how old are you by the way?

Bot: a little under 46 years old

User: are you in a relationship?

Bot: yes i am married, 3 boys, i am pregnant again. i am too busy to focus

The dialog below is sampled from gpt2 trained with default parameters for 8 epochs ( 48824 steps ) and --max_hist 3 on the DailyDialog dataset.

 User: hi how are you ?

Bot: fine. i got a promotion at work earlier this week.

User: where do you work ?

Bot: I work at the polling place.

User: where is that ?

Bot: I live in the dorms across the street.

User: do you like it ?

Bot: it's great. I heard it is far from here.

User: are you a student ?

Bot: yes. I am a student too.

Additional information

Version 1.0.0
Type AI source code
Updated 2024-12-18
Size 30.7KB
Source Github
