
vLLM output is truncated when served through this framework, but neither the official vLLM server nor transformers truncates #314

@TLL1213

Description


提交前必须检查以下项目 | The following items must be checked before submission

  • 请确保使用的是仓库最新代码(git pull),一些问题已被解决和修复。 | Make sure you are using the latest code from the repository (git pull), some issues have already been addressed and fixed.
  • 我已阅读项目文档FAQ章节并且已在Issue中对问题进行了搜索,没有找到相似问题和解决方案 | I have read the FAQ section of the documentation and searched the existing issues without finding a similar problem or solution.

问题类型 | Type of problem

模型推理和部署 | Model inference and deployment

操作系统 | Operating system

Linux

详细描述问题 | Detailed description of the problem

PORT=6006

# model related
MODEL_NAME=qwen2
MODEL_PATH=qwen2-7B-Instruct
PROMPT_NAME=qwen2

# own
MAX_NUM_SEQS=4096
CONTEXT_LEN=4096

# rag related
EMBEDDING_NAME=
RERANK_NAME=

# api related
API_PREFIX=/v1

# vllm related
ENGINE=vllm
TRUST_REMOTE_CODE=true
TOKENIZE_MODE=auto
TENSOR_PARALLEL_SIZE=1
DTYPE=auto

TASKS=llm

TASKS=llm,rag

The above is the configuration file I run with. I have tried running the model with transformers, and also with the command `python -m vllm.entrypoints.openai.api_server --model qwen2-7B-Instruct --port 8080 --served-model-name qwen2`; neither of those produces any truncation. Only when the model is served through this project does the output get truncated: generation is cut off after roughly 600 characters. The model is one I fine-tuned myself, and its main task is generating long text.
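Truncation at a fixed length (~600 characters) while the official vLLM server runs fine often points to a small default `max_tokens` applied on the serving side rather than an engine limit. As a minimal sketch for ruling this out (the endpoint URL is assumed from `PORT=6006` and `API_PREFIX=/v1` in the config above; the model name is `MODEL_NAME`), you can send a request with an explicit, large `max_tokens` and inspect `finish_reason`:

```python
import json
import urllib.request

# Assumed OpenAI-compatible endpoint built from PORT=6006 and API_PREFIX=/v1
url = "http://localhost:6006/v1/chat/completions"

payload = {
    "model": "qwen2",  # MODEL_NAME from the config above
    "messages": [{"role": "user", "content": "请生成一份完整的长文本合同"}],
    "max_tokens": 3000,  # explicitly request a long completion
}

req = urllib.request.Request(
    url,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

# Uncomment when the server is running:
# with urllib.request.urlopen(req) as resp:
#     body = json.load(resp)
#     # finish_reason == "length" would confirm a max_tokens cutoff
#     print(body["choices"][0]["finish_reason"])
#     print(body["choices"][0]["message"]["content"])
```

If `finish_reason` comes back as `"length"` even with a large `max_tokens`, the cap is being imposed elsewhere (for example by the framework's own request defaults), which would narrow down where the truncation happens.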

Dependencies

vllm 0.4.3

运行日志或截图 | Runtime logs or screenshots

甲乙双方各持一份,具有
("Party A and Party B each hold one copy, which has...")

The above is the tail of the truncated output; generation stops abruptly right after the characters "具有".
