
vLLM output is truncated when served through this framework, but neither the official vLLM server nor transformers truncates #314

@TLL1213

Description


提交前必须检查以下项目 | The following items must be checked before submission

  • 请确保使用的是仓库最新代码(git pull),一些问题已被解决和修复。 | Make sure you are using the latest code from the repository (git pull), some issues have already been addressed and fixed.
  • 我已阅读项目文档FAQ章节并且已在Issue中对问题进行了搜索,没有找到相似问题和解决方案 | I have read the FAQ section of the documentation and searched the existing issues without finding a similar problem or solution.

问题类型 | Type of problem

模型推理和部署 | Model inference and deployment

操作系统 | Operating system

Linux

详细描述问题 | Detailed description of the problem

PORT=6006

# model related
MODEL_NAME=qwen2
MODEL_PATH=qwen2-7B-Instruct
PROMPT_NAME=qwen2

# own
MAX_NUM_SEQS=4096
CONTEXT_LEN=4096

# rag related
EMBEDDING_NAME=
RERANK_NAME=

# api related
API_PREFIX=/v1

# vllm related
ENGINE=vllm
TRUST_REMOTE_CODE=true
TOKENIZE_MODE=auto
TENSOR_PARALLEL_SIZE=1
DTYPE=auto

TASKS=llm

TASKS=llm,rag

The above is the configuration file I run with. I have tried running the model with transformers, and also with the command `python -m vllm.entrypoints.openai.api_server --model qwen2-7B-Instruct --port 8080 --served-model-name qwen2`; neither of those produces any truncation. Only when the model is served through this project does the output get truncated: generation is cut off after roughly 600 characters. The model is one I fine-tuned myself, and its main task is generating long text.
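Truncation at a fixed length (~600 characters) while the official vLLM server runs fine often points to a small default `max_tokens` applied on the serving side rather than an engine limit. As a minimal sketch for ruling this out (the endpoint URL is assumed from `PORT=6006` and `API_PREFIX=/v1` in the config above; the model name is `MODEL_NAME`), you can send a request with an explicit, large `max_tokens` and inspect `finish_reason`:

```python
import json
import urllib.request

# Assumed OpenAI-compatible endpoint built from PORT=6006 and API_PREFIX=/v1
url = "http://localhost:6006/v1/chat/completions"

payload = {
    "model": "qwen2",  # MODEL_NAME from the config above
    "messages": [{"role": "user", "content": "请生成一份完整的长文本合同"}],
    "max_tokens": 3000,  # explicitly request a long completion
}

req = urllib.request.Request(
    url,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

# Uncomment when the server is running:
# with urllib.request.urlopen(req) as resp:
#     body = json.load(resp)
#     # finish_reason == "length" would confirm a max_tokens cutoff
#     print(body["choices"][0]["finish_reason"])
#     print(body["choices"][0]["message"]["content"])
```

If `finish_reason` comes back as `"length"` even with a large `max_tokens`, the cap is being imposed elsewhere (for example by the framework's own request defaults), which would narrow down where the truncation happens.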

Dependencies

vllm 0.4.3

运行日志或截图 | Runtime logs or screenshots

甲乙双方各持一份,具有
("Party A and Party B each hold one copy, which has...")

The above is the tail of the truncated output; generation stops abruptly right after the characters "具有".
