Speculative Decoding

Explorations into some recent techniques surrounding speculative decoding.

I also have a few ideas of my own, which I will try to share in this repository if they work out. The initial goal is to use them to speed up the text-to-semantic decoder in Spear-TTS.

Appreciation

StabilityAI and 🤗 Huggingface for the generous sponsorship, as well as my other sponsors, for affording me the independence to open source current artificial intelligence techniques.

Todo

Citations

 @inproceedings { Leviathan2022FastIF ,
    title   = { Fast Inference from Transformers via Speculative Decoding } ,
    author  = { Yaniv Leviathan and Matan Kalman and Y. Matias } ,
    booktitle = { International Conference on Machine Learning } ,
    year    = { 2022 } ,
    url     = { https://api.semanticscholar.org/CorpusID:254096365 }
}

 @inproceedings { sun2023spectr ,
    title     = { SpecTr: Fast Speculative Decoding via Optimal Transport } ,
    author    = { Ziteng Sun and Ananda Theertha Suresh and Jae Hun Ro and Ahmad Beirami and Himanshu Jain and Felix Yu and Michael Riley and Sanjiv Kumar } ,
    booktitle = { Workshop on Efficient Systems for Foundation Models @ ICML2023 } ,
    year      = { 2023 } ,
    url       = { https://openreview.net/forum?id=d0mGsaheuT }
}

 @article { Chen2023AcceleratingLL ,
    title     = { Accelerating Large Language Model Decoding with Speculative Sampling } ,
    author    = { Charlie Chen and Sebastian Borgeaud and Geoffrey Irving and Jean-Baptiste Lespiau and L. Sifre and John M. Jumper } ,
    journal   = { ArXiv } ,
    year      = { 2023 } ,
    volume    = { abs/2302.01318 } ,
    url       = { https://api.semanticscholar.org/CorpusID:256503945 }
}

 @article { Yan2020ProphetNetPF ,
    title   = { ProphetNet: Predicting Future N-gram for Sequence-to-Sequence Pre-training } ,
    author  = { Yu Yan and Weizhen Qi and Yeyun Gong and Dayiheng Liu and Nan Duan and Jiusheng Chen and Ruofei Zhang and Ming Zhou } ,
    journal = { ArXiv } ,
    year    = { 2020 } ,
    volume  = { abs/2001.04063 } ,
    url     = { https://api.semanticscholar.org/CorpusID:210164665 }
}

 @article { Zhang2023DraftV ,
    title     = { Draft & Verify: Lossless Large Language Model Acceleration via Self-Speculative Decoding } ,
    author    = { Jun Zhang and Jue Wang and Huan Li and Lidan Shou and Ke Chen and Gang Chen and Sharad Mehrotra } ,
    journal   = { ArXiv } ,
    year      = { 2023 } ,
    volume    = { abs/2309.08168 } ,
    url       = { https://api.semanticscholar.org/CorpusID:262013673 }
}

 @misc { medusa ,
    author     = { Tianle Cai and Yuhong Li and Zhengyang Geng and Hongwu Peng and Tri Dao } ,
    title      = { Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads } ,
    year       = { 2023 } ,
    publisher  = { GitHub } ,
    journal    = { GitHub repository } ,
    howpublished = { \url{https://github.com/FasterDecoding/Medusa} } ,
}