Deploying with NVIDIA Triton

The Triton Inference Server hosts a tutorial that demonstrates how to quickly deploy a simple facebook/opt-125m model using vLLM. See Deploying a vLLM model in Triton for more details.
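Once a model has been deployed this way, it can be queried over HTTP. The following is a minimal sketch rather than part of the tutorial itself. It assumes a Triton server running locally on its default HTTP port (8000), a model directory named `vllm_model`, and the `text_input` / `text_output` field names used by Triton's generate endpoint with the vLLM backend. The helper names `build_generate_request` and `generate` are hypothetical.

```python
import json
import urllib.request

# Assumptions: a local Triton server on its default HTTP port, and a
# model directory named "vllm_model". Adjust both to match your setup.
TRITON_URL = "http://localhost:8000"
MODEL_NAME = "vllm_model"


def build_generate_request(prompt, temperature=0.0):
    """Build the URL and JSON body for Triton's generate endpoint."""
    url = f"{TRITON_URL}/v2/models/{MODEL_NAME}/generate"
    body = {
        "text_input": prompt,
        "parameters": {"stream": False, "temperature": temperature},
    }
    return url, json.dumps(body).encode("utf-8")


def generate(prompt):
    """Send the prompt to the server and return the generated text.

    Requires a running Triton server that is serving the model.
    """
    url, data = build_generate_request(prompt)
    request = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["text_output"]
```

With the server up, `generate("What is Triton Inference Server?")` should return the model's completion as a string. Building the request separately from sending it keeps the payload easy to inspect without a live server.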