This document provides an overview of TensorRT, a high-performance AI inference engine from NVIDIA. It offers cross-platform capabilities, supports various programming interfaces, and allows for custom plugin development. Below are details about its features and associated resources.
TensorRT 是Nvidia 推出的跨 nv-gpu架构的半开源高性能AI 推理引擎框架/库,提供了cpp/python接口,以及用户自定义插件方法,涵盖了AI 推理引擎技术的主要方面。
TensorRT is a semi-open source high-performance AI inference engine framework/library developed by Nvidia, which spans across nv-gpu architectures.
Provides cpp/python interfaces and user-defined plugin methods, covering the main aspects of AI inference engine technology.
Reference
https://docs.nvidia.com/deeplearning/tensorrt/archives/
https://developer.nvidia.com/search?page=1&sort=relevance&term=
https://github.com/HeKun-NVIDIA/TensorRT-DeveloperGuidein_Chinese/tree/main
https://docs.nvidia.com/deeplearning/tensorrt/migration-guide/index.html
https://developer.nvidia.com/zh-cn/blog/nvidia-gpu-fp8-training-inference/