SGLang is a fast serving framework for large language models and vision language models.
cuda inference pytorch transformer openai moe llama vlm kimi blackwell llm llm-serving llava deepseek llama3 deepseek-v3 deepseek-r1 qwen3 gpt-oss deepseek-v3-2
Updated Sep 27, 2025 - Python