Threads | OverS9982 | MetaX Developer Forum

Docker image: vllm-metax:0.15.0-maca.ai3.5.3.203-torch2.8-py312-ubuntu22.04-amd64
mx-smi output:

mx-smi  version: 2.2.12

=================== MetaX System Management Interface Log ===================
Timestamp                                         : Tue Apr  7 13:23:44 2026

Attached GPUs                                     : 8
+---------------------------------------------------------------------------------+
| MX-SMI 2.2.12                      Kernel Mode Driver Version: 3.3.12           |
| MACA Version: 3.5.3.20             BIOS Version: 1.22.3.0                       |
|------------------+-----------------+---------------------+----------------------|
| Board       Name | GPU   Persist-M | Bus-id              | GPU-Util      sGPU-M |
| Pwr:Usage/Cap    | Temp       Perf | Memory-Usage        | GPU-State            |
|==================+=================+=====================+======================|
| 0     MetaX C550 | 0           N/A | 0000:2a:00.0        | 0%          Disabled |
| NA / NA          | 34C         N/A | 858/65536 MiB       | Available            |
+------------------+-----------------+---------------------+----------------------+
| 1     MetaX C550 | 1           N/A | 0000:3a:00.0        | 0%          Disabled |
| NA / NA          | 38C         N/A | 858/65536 MiB       | Available            |
+------------------+-----------------+---------------------+----------------------+
| 2     MetaX C550 | 2           N/A | 0000:4c:00.0        | 0%          Disabled |
| NA / NA          | 39C         N/A | 858/65536 MiB       | Available            |
+------------------+-----------------+---------------------+----------------------+
| 3     MetaX C550 | 3           N/A | 0000:5c:00.0        | 0%          Disabled |
| NA / NA          | 37C         N/A | 858/65536 MiB       | Available            |
+------------------+-----------------+---------------------+----------------------+
| 4     MetaX C550 | 4           N/A | 0000:aa:00.0        | 0%          Disabled |
| NA / NA          | 37C         N/A | 858/65536 MiB       | Available            |
+------------------+-----------------+---------------------+----------------------+
| 5     MetaX C550 | 5           N/A | 0000:ba:00.0        | 0%          Disabled |
| NA / NA          | 39C         N/A | 858/65536 MiB       | Available            |
+------------------+-----------------+---------------------+----------------------+
| 6     MetaX C550 | 6           N/A | 0000:ca:00.0        | 0%          Disabled |
| NA / NA          | 39C         N/A | 858/65536 MiB       | Available            |
+------------------+-----------------+---------------------+----------------------+
| 7     MetaX C550 | 7           N/A | 0000:da:00.0        | 0%          Disabled |
| NA / NA          | 35C         N/A | 858/65536 MiB       | Available            |
+------------------+-----------------+---------------------+----------------------+

+---------------------------------------------------------------------------------+
| Process:                                                                        |
|  GPU                    PID         Process Name                 GPU Memory     |
|                                                                  Usage(MiB)     |
|=================================================================================|
|  no process found                                                               |
+---------------------------------------------------------------------------------+

End of Log

vLLM launch command:

nohup vllm serve /data/metax-tech/Qwen3.5-397B-A17B-W8A8 \
  --host 0.0.0.0 \
  --port 8000 \
  --tensor-parallel-size 8 \
  --gpu-memory-utilization 0.9 \
  --max-model-len 262144 \
  --reasoning-parser qwen3 \
  --enable-auto-tool-choice \
  --tool-call-parser qwen3_coder \
  --served-model-name Qwen3.5-W8A8 \
  --speculative-config '{"method":"qwen3_next_mtp","num_speculative_tokens":2}' &

Error output:

root@h20gpu-20:/data/metax-tech/Qwen3.5-397B-A17B-W8A8# tail -f nohup.out
(APIServer pid=32)   File "/opt/conda/lib/python3.12/site-packages/vllm/engine/arg_utils.py", line 12
(APIServer pid=32)     return ModelConfig(
(APIServer pid=32)            ^^^^^^^^^^^^
(APIServer pid=32)   File "/opt/conda/lib/python3.12/site-packages/pydantic/_internal/_dataclasses.py
(APIServer pid=32)     s.__pydantic_validator__.validate_python(ArgsKwargs(args, kwargs), self_instan
(APIServer pid=32) pydantic_core._pydantic_core.ValidationError: 1 validation error for ModelConfig
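(APIServer pid=32)   Value error, The checkpoint you are trying to load has model type `qwen3_5_moe` but Transformers does not recognize this architecture. This could be because of an issue with the checkpoint, or because your version of Transformers is out of date.
(APIServer pid=32)
(APIServer pid=32) You can update Transformers with the command `pip install --upgrade transformers`. If this does not work, and the checkpoint is very new, then there may not be a release version that supports this model yet. In this case, you can get the most up-to-date code by installing Transformers from source with the command `pip install git+https://github.com/huggingface/transformers.git` [type=value_error, input_value=ArgsKwargs((), {'model': ...rocessor_plugin': None}), input_type=ArgsKwargs]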
(APIServer pid=32)     For further information visit https://errors.pydantic.dev/2.12/v/value_error

Do I need to upgrade transformers?
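
To narrow this down before upgrading anything, a quick check inside the container could compare the `model_type` declared in the checkpoint's `config.json` with what the installed Transformers knows about. The following is only a minimal sketch: the helper names are made up for illustration, and it assumes that `CONFIG_MAPPING` in `transformers.models.auto.configuration_auto` is the registry the error message refers to. Whether a newer Transformers is compatible with this vllm-metax image is not something the script can tell.

```python
import json
from importlib import metadata
from pathlib import Path


def checkpoint_model_type(model_dir):
    """Return the `model_type` declared in the checkpoint's config.json."""
    config_path = Path(model_dir) / "config.json"
    config = json.loads(config_path.read_text(encoding="utf-8"))
    return config.get("model_type")


def installed_version(package="transformers"):
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def transformers_recognizes(model_type):
    """Return True/False if Transformers is importable, None if it is not."""
    try:
        # Assumption: this mapping is what the auto-config lookup consults.
        from transformers.models.auto.configuration_auto import CONFIG_MAPPING
    except ImportError:
        return None
    return model_type in CONFIG_MAPPING


if __name__ == "__main__":
    # Model path taken from the launch command in this post.
    model_dir = "/data/metax-tech/Qwen3.5-397B-A17B-W8A8"
    if (Path(model_dir) / "config.json").is_file():
        model_type = checkpoint_model_type(model_dir)
        print("checkpoint model_type:", model_type)
        print("transformers version:", installed_version())
        print("recognized by transformers:", transformers_recognizes(model_type))
```

If the last line prints `False`, the installed Transformers does not know `qwen3_5_moe`, which would match the error above.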