IndoNLG

Read this README in Bahasa Indonesia

Update 16/11/2024: We have updated the links to the datasets and the fastText model in IndoNLG!

IndoNLG is a collection of Natural Language Generation (NLG) resources for Bahasa Indonesia covering 6 kinds of downstream tasks. We provide the code to reproduce the results, along with large pre-trained models (IndoBART and IndoGPT) trained on a corpus of around 4 billion words (Indo4B-Plus), approximately 25 GB of text data. This project was started as a joint collaboration between universities and industry, including Institut Teknologi Bandung, Universitas Multimedia Nusantara, The Hong Kong University of Science and Technology, Universitas Indonesia, DeepMind, Gojek, and Prosa.AI.

Research Paper

IndoNLG has been accepted at EMNLP 2021, and you can find the details in our paper: https://aclanthology.org/2021.emnlp-main.699. If you use any component of IndoNLG in your work, including Indo4B-Plus, IndoBART, or IndoGPT, please cite the following paper:

 @inproceedings{cahyawijaya-etal-2021-indonlg,
    title = "{I}ndo{NLG}: Benchmark and Resources for Evaluating {I}ndonesian Natural Language Generation",
    author = "Cahyawijaya, Samuel and Winata, Genta Indra and Wilie, Bryan and Vincentio, Karissa and Li, Xiaohong and Kuncoro, Adhiguna and Ruder, Sebastian and Lim, Zhi Yuan and Bahar, Syafri and Khodra, Masayu and Purwarianti, Ayu and Fung, Pascale",
    booktitle = "Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing",
    month = nov,
    year = "2021",
    address = "Online and Punta Cana, Dominican Republic",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.emnlp-main.699",
    pages = "8875--8898",
}