
Sarathi-Serve

Sarathi-Serve is a high-throughput, low-latency LLM serving framework. Refer to our OSDI'24 paper for more details.

Setup

Setup CUDA

Sarathi-Serve has been tested with CUDA 12.3 on H100 and A100 GPUs.
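
To compare a machine against the tested configuration, a quick check like the one below can help. It is only a sketch: the helper name cuda_release is made up for illustration, and it assumes nvcc prints its usual "release X.Y" line and that nvidia-smi, if present, supports the --query-gpu option.

```shell
# Extract "major.minor" from the text printed by `nvcc --version`.
# (cuda_release is an illustrative helper, not part of Sarathi-Serve.)
cuda_release() {
  sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

# Report the CUDA toolkit version if nvcc is available.
if command -v nvcc >/dev/null 2>&1; then
  echo "CUDA toolkit: $(nvcc --version | cuda_release)"
else
  echo "nvcc not found on PATH"
fi

# Report the GPU model and driver version if nvidia-smi is available.
if command -v nvidia-smi >/dev/null 2>&1; then
  nvidia-smi --query-gpu=name,driver_version --format=csv,noheader \
    || echo "nvidia-smi could not query the GPU"
else
  echo "nvidia-smi not found on PATH"
fi
```

A toolkit other than 12.3 or a GPU other than H100/A100 may still work, but it falls outside what the README states was tested.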

Clone repository

git clone git@github.com:microsoft/sarathi-serve.git

Create mamba environment

Set up mamba if you don't already have it:

wget https://github.com/conda-forge/miniforge/releases/latest/download/Mambaforge-Linux-x86_64.sh
bash Mambaforge-Linux-x86_64.sh # follow the instructions from there

Create a Python 3.10 environment:

mamba create -p ./env python=3.10
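
Because the environment is created with -p, it lives at a path rather than under a name, so it presumably needs to be activated by that path (for example with mamba activate ./env, assuming mamba's shell integration is initialized) before installing anything. The sketch below checks that the active interpreter is a 3.10.x release; is_python_310 is a hypothetical helper, not something shipped with the repository.

```shell
# Succeed only when the given `python --version` string is a 3.10.x release.
# (is_python_310 is an illustrative helper, not part of Sarathi-Serve.)
is_python_310() {
  case "$1" in
    "Python 3.10."*) return 0 ;;
    *) return 1 ;;
  esac
}

# Run this inside the activated environment.
if command -v python >/dev/null 2>&1; then
  ver="$(python --version 2>&1)"
  if is_python_310 "$ver"; then
    echo "OK: $ver"
  else
    echo "Unexpected interpreter: $ver"
  fi
else
  echo "python not found on PATH"
fi
```

If the check reports a different interpreter, the environment is most likely not active in the current shell.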

Install Sarathi-Serve

pip install -e . --extra-index-url https://flashinfer.ai/whl/cu121/torch2.3/
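
The path of the extra index URL (cu121/torch2.3) suggests wheels built against CUDA 12.1 and PyTorch 2.3. After installation, a check along these lines can show which PyTorch build actually ended up in the environment. It is a sketch under that assumption; report_torch is a made-up helper name, and it prints a fallback message rather than failing when PyTorch is missing.

```shell
# Print the installed PyTorch version and the CUDA version it was built with.
# (report_torch is an illustrative helper, not part of Sarathi-Serve.)
report_torch() {
  if command -v python >/dev/null 2>&1; then
    python -c 'import torch; print("torch", torch.__version__, "CUDA", torch.version.cuda)' 2>/dev/null \
      || echo "torch is not importable in this environment"
  else
    echo "python not found on PATH"
  fi
}

report_torch
```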

Reproducing Results

Refer to the READMEs in the individual folders corresponding to each figure in osdi-experiments.

Citation

If you use our work, please consider citing our paper:

 @article{agrawal2024taming,
  title={Taming Throughput-Latency Tradeoff in LLM Inference with Sarathi-Serve},
  author={Agrawal, Amey and Kedia, Nitin and Panwar, Ashish and Mohan, Jayashree and Kwatra, Nipun and Gulavani, Bhargav S and Tumanov, Alexey and Ramjee, Ramachandran},
  journal={Proceedings of 18th USENIX Symposium on Operating Systems Design and Implementation, 2024, Santa Clara},
  year={2024}
}

Acknowledgment

This repository originally started as a fork of the vLLM project. Sarathi-Serve is a research prototype and does not have complete feature parity with open-source vLLM. We have retained only the most critical features and adopted the codebase for faster research iterations.

