Sarathi-Serve

Sarathi-Serve is a high-throughput, low-latency LLM serving framework. Please refer to our OSDI'24 paper for more details.

Installation

Set up CUDA

Sarathi-Serve has been tested with CUDA 12.3 on H100 and A100 GPUs.
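Before continuing, it can help to confirm which CUDA toolkit is on your PATH. The snippet below is a minimal sketch, not part of the Sarathi-Serve repository: `cuda_release` is a hypothetical helper name, and it assumes `nvcc --version` prints a line of the form "Cuda compilation tools, release 12.3, ...".

```shell
# Hypothetical helper (not part of Sarathi-Serve): read nvcc-style version
# text on stdin and print the "major.minor" release number.
cuda_release() {
  sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

# Report the installed toolkit version, or say that nvcc is missing.
if command -v nvcc >/dev/null 2>&1; then
  echo "CUDA toolkit: $(nvcc --version | cuda_release)"
else
  echo "nvcc not found on PATH"
fi
```

If the reported release is not 12.3, the framework may still work, but that configuration is outside what the authors state they tested.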

Clone the repository

git clone git@github.com:microsoft/sarathi-serve.git

Create a mamba environment

Set up mamba if you don't already have it:

wget https://github.com/conda-forge/miniforge/releases/latest/download/Mambaforge-Linux-x86_64.sh
bash Mambaforge-Linux-x86_64.sh # follow the instructions from there

Create a Python 3.10 environment:

mamba create -p ./env python=3.10

Install Sarathi-Serve

pip install -e . --extra-index-url https://flashinfer.ai/whl/cu121/torch2.3/

Reproducing results

Refer to the README files in the individual folders corresponding to each figure in osdi-experiments.

Citation

If you use our work, please consider citing our paper:

 @article{agrawal2024taming,
  title={Taming Throughput-Latency Tradeoff in LLM Inference with Sarathi-Serve},
  author={Agrawal, Amey and Kedia, Nitin and Panwar, Ashish and Mohan, Jayashree and Kwatra, Nipun and Gulavani, Bhargav S and Tumanov, Alexey and Ramjee, Ramachandran},
  journal={Proceedings of 18th USENIX Symposium on Operating Systems Design and Implementation, 2024, Santa Clara},
  year={2024}
}

Acknowledgment

This repository originally started as a fork of the vLLM project. Sarathi-Serve is a research prototype and does not have complete feature parity with open-source vLLM. We have retained only the most critical features and adapted the codebase for faster research iterations.


Additional information

Version 1.0.0
Type Other source code
Updated 2025-01-09
Size 253.84KB
Source GitHub
