Description
The following items must be checked before submission
- Make sure you are using the latest code from the repository (git pull); some issues have already been addressed and fixed.
- I have read the project documentation and the FAQ section and searched the existing issues/discussions without finding a similar problem or solution.
Type of problem
Model inference and deployment
Operating system
Linux
Detailed description of the problem
PORT=6006
# model related
MODEL_NAME=qwen2
MODEL_PATH=qwen2-7B-Instruct
PROMPT_NAME=qwen2
# own
MAX_NUM_SEQS=4096
CONTEXT_LEN=4096
# rag related
EMBEDDING_NAME=
RERANK_NAME=
# api related
API_PREFIX=/v1
# vllm related
ENGINE=vllm
TRUST_REMOTE_CODE=true
TOKENIZE_MODE=auto
TENSOR_PARALLEL_SIZE=1
DTYPE=auto
TASKS=llm
TASKS=llm,rag
The above is the configuration file I am running with. I have also tried running the model with transformers, and with the command python -m vllm.entrypoints.openai.api_server --model qwen2-7B-Instruct --port 8080 --served-model-name qwen2; neither of those shows any truncation. Only when the service is started through this project does the output get cut off, roughly 600 characters into the generation. The model is one I fine-tuned myself, and its main task is generating long text.
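One quick way to narrow this down is to call the service with an explicit max_tokens and inspect the reported finish_reason. The snippet below is only a diagnostic sketch, not part of this report's setup: it assumes the server is reachable at http://localhost:6006/v1 (PORT and API_PREFIX from the config above), exposes the usual OpenAI-compatible /chat/completions route, serves the model under the name qwen2, and the prompt text is a placeholder.

import requests

# Diagnostic sketch (assumptions: server on localhost:6006, API prefix /v1,
# served model name "qwen2"): request a large explicit max_tokens to see
# whether the ~600-character cutoff comes from a default token budget.
BASE_URL = "http://localhost:6006/v1"

payload = {
    "model": "qwen2",
    "messages": [
        # Placeholder prompt for a long-text generation task
        {"role": "user", "content": "请生成一份完整的长篇合同文本。"}
    ],
    "max_tokens": 4096,   # explicit generation budget
    "temperature": 0.7,
}

resp = requests.post(f"{BASE_URL}/chat/completions", json=payload, timeout=600)
resp.raise_for_status()
choice = resp.json()["choices"][0]

# "length" would indicate the output was cut off by a token limit,
# not by the model emitting an end-of-sequence token.
print("finish_reason:", choice.get("finish_reason"))
print("output length:", len(choice["message"]["content"]))

If finish_reason comes back as "length", the truncation is being imposed by a token budget (the request's default max_tokens or the server's context/generation limit) rather than by the fine-tuned model itself.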
Dependencies
vllm 0.4.3
Runtime logs or screenshots
甲乙双方各持一份,具有
The line above is the final truncated portion I captured; the output stops abruptly right after the characters "具有".