Sarathi-Serve

Sarathi-Serve is a high-throughput, low-latency LLM serving framework. Please refer to our OSDI'24 paper for more details.

Setup

Setup CUDA

Sarathi-Serve has been tested with CUDA 12.3 on H100 and A100 GPUs.
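
Before installing, it can help to confirm which CUDA toolkit release is present on the machine. The sketch below is one way to do that, assuming `nvcc` is on the PATH; the helper name `cuda_release` is hypothetical (not part of Sarathi-Serve) and only extracts the "release X.Y" number from the output of `nvcc --version`.

```shell
# Hypothetical helper (not part of Sarathi-Serve): read `nvcc --version`
# output on stdin and print the "release X.Y" number it reports.
cuda_release() {
  sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p' | head -n 1
}

# Report the local toolkit release, assuming nvcc is installed and on PATH.
if command -v nvcc >/dev/null 2>&1; then
  echo "CUDA toolkit release: $(nvcc --version | cuda_release)"
else
  echo "nvcc not found on PATH"
fi
```

A release other than 12.3 does not necessarily mean the install will fail; it only means the combination is outside what the authors report having tested.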

Clone repository

git clone git@github.com:microsoft/sarathi-serve.git

Create mamba environment

Set up mamba if you don't already have it:

wget https://github.com/conda-forge/miniforge/releases/latest/download/Mambaforge-Linux-x86_64.sh
bash Mambaforge-Linux-x86_64.sh # follow the instructions from there

Create a Python 3.10 environment:

mamba create -p ./env python=3.10

Install Sarathi-Serve

pip install -e . --extra-index-url https://flashinfer.ai/whl/cu121/torch2.3/

Reproducing Results

Refer to the READMEs in the individual folders corresponding to each figure in osdi-experiments.

Citation

If you use our work, please consider citing our paper:

 @article{agrawal2024taming,
  title={Taming Throughput-Latency Tradeoff in LLM Inference with Sarathi-Serve},
  author={Agrawal, Amey and Kedia, Nitin and Panwar, Ashish and Mohan, Jayashree and Kwatra, Nipun and Gulavani, Bhargav S and Tumanov, Alexey and Ramjee, Ramachandran},
  journal={Proceedings of 18th USENIX Symposium on Operating Systems Design and Implementation, 2024, Santa Clara},
  year={2024}
}

Acknowledgment

This repository originally started as a fork of the vLLM project. Sarathi-Serve is a research prototype and does not have complete feature parity with open-source vLLM. We have retained only the most critical features and adopted the codebase for faster research iterations.

Additional information

Version: 1.0.0
Type: Other source code
Last updated: 2025-01-09
Size: 253.84 KB
Source: GitHub