MetaX-Tech Developer Forum (Forum Home)
  • MetaX Developers

OverS9982

  • Members
  • Joined December 30, 2025

OverS9982 has posted 8 messages.

  • See post
    OverS9982
    Members
    Error running Qwen3.5-397B-A17B-W8A8 [Solved] April 7, 2026, 17:40

    When is sglang expected to be updated?

  • See post
    OverS9982
    Members
    Error running Qwen3.5-397B-A17B-W8A8 [Solved] April 7, 2026, 17:35

    The vllm and sglang model weight files should be interchangeable, right?
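Both engines load checkpoints in the Hugging Face layout, so whichever one is used, the checkpoint directory's config.json declares the architecture. A quick sanity check (pure Python; the directory path is just the one from the launch command, used as an example):

```python
import json
from pathlib import Path

def declared_model_type(checkpoint_dir: str) -> str:
    """Read the architecture name a HF-layout checkpoint declares in config.json."""
    config = json.loads((Path(checkpoint_dir) / "config.json").read_text())
    return config["model_type"]

# Example with the checkpoint directory from the launch command:
# declared_model_type("/software/metax-tech/Qwen3.5-397B-A17B-W8A8")
```

If this prints `qwen3_5_moe`, both vllm and sglang will be parsing the same architecture string, so any "unrecognized model type" failure comes from the installed libraries, not the weights.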

  • See post
    OverS9982
    Members
    Error running Qwen3.5-397B-A17B-W8A8 [Solved] April 7, 2026, 17:06

    This time I switched to sglang to try running Qwen3.5.
    Docker image: sglang:0.5.8-maca.ai3.5.3.104-torch2.8-py312-ubuntu22.04-amd64
    sglang launch command:

    nohup python -m sglang.launch_server \
      --model-path /software/metax-tech/Qwen3.5-397B-A17B-W8A8 \
      --host 0.0.0.0 \
      --port 8000 \
      --tp 8 \
      --mem-fraction-static 0.9 \
      --context-length 262144 \
      --reasoning-parser qwen3 \
      --tool-call-parser qwen3_coder \
      --served-model-name Qwen3.5-W8A8 \
      --api-key 20260407 \
      --speculative-algo NEXTN \
      --speculative-num-steps 3 \
      --speculative-eagle-topk 1 \
      --speculative-num-draft-tokens 4 &
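Since the command starts the server with `--api-key`, any client must send that key as a Bearer token once the server is up. A minimal request builder for the OpenAI-compatible endpoint (the model name and key come from the command above; the helper function itself is just a sketch):

```python
import json

def build_chat_request(model: str, api_key: str, prompt: str):
    """Build headers and JSON body for an OpenAI-compatible /v1/chat/completions call."""
    headers = {
        "Content-Type": "application/json",
        # Required because the server was launched with --api-key:
        "Authorization": f"Bearer {api_key}",
    }
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    })
    return headers, body

headers, body = build_chat_request("Qwen3.5-W8A8", "20260407", "hello")
```

A request without the Authorization header will be rejected with 401 before it ever reaches the model.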
    

    After launching, it failed with the following error:

    ValueError: The checkpoint you are trying to load has model type `qwen3_5_moe` but Transformers does not recognize this architecture. This could be because of an issue with the checkpoint, or because your version of Transformers is out of date.
    
    You can update Transformers with the command `pip install --upgrade transformers`. If this does not work, and the checkpoint is very new, then there may not be a release version that supports this model yet. In this case, you can get the most up-to-date code by installing Transformers from source with the command `pip install git+https://github.com/huggingface/transformers.git`
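The message boils down to a version gate: the installed Transformers must be new enough to know the `qwen3_5_moe` model type. A minimal pre-flight check before launching the server (pure Python; the minimum version 5.2.0 is simply the one reported to work below, not an officially documented floor):

```python
def version_tuple(v: str) -> tuple:
    """Turn a dotted version string like '5.2.0' into (5, 2, 0) for comparison."""
    return tuple(int(part) for part in v.split("."))

def supports_model(installed: str, minimum: str) -> bool:
    """True if the installed library version is at least the required minimum."""
    return version_tuple(installed) >= version_tuple(minimum)

# e.g. supports_model("4.57.0", "5.2.0") is False: an upgrade is needed first
```

Tuple comparison avoids the classic string-comparison trap where "5.10" sorts before "5.2".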
    

    Following my earlier vllm fix, I upgraded transformers to 5.2.0, but sglang still failed to start.
    The log is below:

    Version info:Mcoplib_Version = '0.4.0'
    Build_Maca_Version = '3.5.3.18'
    GIT_BRANCH = 'HEAD'
    GIT_COMMIT = '3cd1a1a'
    Vllm Op Version = 0.14.0
    SGlang Op Version  = 0.5.7 && 0.5.8
    
    INFO Staring Check the current MACA version of the operating environment.
    
    INFO: Release major.minor matching,  successful:3.5.
    
    Skipping import of cpp extensions due to incompatible torch version 2.8.0+metax3.5.3.9 for torchao version 0.15.0             Please see https://github.com/pytorch/ao/issues/2919 for more info
    /opt/conda/lib/python3.12/site-packages/torchvision/datapoints/__init__.py:12: UserWarning: The torchvision.datapoints and torchvision.transforms.v2 namespaces are still Beta. While we do not expect major breaking changes, some APIs may still change according to user feedback. Please submit any feedback you may have in this issue: https://github.com/pytorch/vision/issues/6753, and you can also check out https://github.com/pytorch/vision/issues/7319 to learn more about the APIs that we suspect might involve future changes. You can silence this warning by calling torchvision.disable_beta_transforms_warning().
      warnings.warn(_BETA_TRANSFORMS_WARNING)
    /opt/conda/lib/python3.12/site-packages/torchvision/transforms/v2/__init__.py:54: UserWarning: The torchvision.datapoints and torchvision.transforms.v2 namespaces are still Beta. While we do not expect major breaking changes, some APIs may still change according to user feedback. Please submit any feedback you may have in this issue: https://github.com/pytorch/vision/issues/6753, and you can also check out https://github.com/pytorch/vision/issues/7319 to learn more about the APIs that we suspect might involve future changes. You can silence this warning by calling torchvision.disable_beta_transforms_warning().
      warnings.warn(_BETA_TRANSFORMS_WARNING)
    Traceback (most recent call last):
      File "<frozen runpy>", line 198, in _run_module_as_main
      File "<frozen runpy>", line 88, in _run_code
      File "/opt/conda/lib/python3.12/site-packages/sglang/launch_server.py", line 7, in <module>
        from sglang.srt.server_args import prepare_server_args
      File "/opt/conda/lib/python3.12/site-packages/sglang/srt/server_args.py", line 67, in <module>
        from sglang.srt.utils.hf_transformers_utils import check_gguf_file
      File "/opt/conda/lib/python3.12/site-packages/sglang/srt/utils/hf_transformers_utils.py", line 46, in <module>
        from sglang.srt.configs import (
      File "/opt/conda/lib/python3.12/site-packages/sglang/srt/configs/__init__.py", line 9, in <module>
        from sglang.srt.configs.janus_pro import MultiModalityConfig
      File "/opt/conda/lib/python3.12/site-packages/sglang/srt/configs/janus_pro.py", line 634, in <module>
        register_image_processor(MultiModalityConfig, VLMImageProcessor)
      File "/opt/conda/lib/python3.12/site-packages/sglang/srt/configs/utils.py", line 18, in register_image_processor
        AutoImageProcessor.register(config, None, image_processor, None, exist_ok=True)
    TypeError: AutoImageProcessor.register() got multiple values for argument 'exist_ok'
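For what it's worth, this TypeError is a Python-level collision, not a model issue: it fires whenever a call supplies the same parameter both positionally and by keyword, which suggests the upgraded transformers changed the signature of `AutoImageProcessor.register` while sglang's call site still passes the old positional arguments. A minimal reproduction of the error class (toy function, not the real transformers API):

```python
def register_new(config, slow_class=None, fast_class=None, exist_ok=False):
    """Toy stand-in for a shrunken signature: exist_ok is now the 4th parameter."""
    return exist_ok

# An old-style call passes four positionals AND exist_ok=, so the 4th positional
# lands on exist_ok and the keyword collides with it:
try:
    register_new("cfg", None, "processor", None, exist_ok=True)
    collided = False
except TypeError:  # "got multiple values for argument 'exist_ok'"
    collided = True
```

So the mismatch is between the sglang build in the image and transformers 5.2.0; either a matching sglang release or a pinned transformers version should resolve it.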
    
  • See post
    OverS9982
    Members
    Error running Qwen3.5-397B-A17B-W8A8 [Solved] April 7, 2026, 16:18

    I am using the Qwen3.5-397B-A17B-W8A8 quantized weights your company published on ModelScope.
    URL: modelscope.cn/models/metax-tech/Qwen3.5-397B-A17B-W8A8

  • See post
    OverS9982
    Members
    Error running Qwen3.5-397B-A17B-W8A8 [Solved] April 7, 2026, 16:11

    Hi, the server now starts successfully, but it crashes as soon as a request runs. The error is below.

    (APIServer pid=279037) INFO:     10.100.20.4:51028 - "GET /v1/models HTTP/1.1" 200 OK
    (Worker_TP2 pid=279587) ERROR 04-07 16:04:24 [multiproc_executor.py:852] WorkerProc hit an exception.
    (Worker_TP2 pid=279587) ERROR 04-07 16:04:24 [multiproc_executor.py:852] Traceback (most recent call last):
    (Worker_TP2 pid=279587) ERROR 04-07 16:04:24 [multiproc_executor.py:852]   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/executor/multiproc_executor.py", line 847, in worker_busy_loop
    (Worker_TP2 pid=279587) ERROR 04-07 16:04:24 [multiproc_executor.py:852]     output = func(*args, **kwargs)
    (Worker_TP2 pid=279587) ERROR 04-07 16:04:24 [multiproc_executor.py:852]              ^^^^^^^^^^^^^^^^^^^^^
    (Worker_TP2 pid=279587) ERROR 04-07 16:04:24 [multiproc_executor.py:852]   File "/opt/conda/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
    (Worker_TP2 pid=279587) ERROR 04-07 16:04:24 [multiproc_executor.py:852]     return func(*args, **kwargs)
    (Worker_TP2 pid=279587) ERROR 04-07 16:04:24 [multiproc_executor.py:852]            ^^^^^^^^^^^^^^^^^^^^^
    (Worker_TP2 pid=279587) ERROR 04-07 16:04:24 [multiproc_executor.py:852]   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/worker/gpu_worker.py", line 579, in sample_tokens
    (Worker_TP2 pid=279587) ERROR 04-07 16:04:24 [multiproc_executor.py:852]     return self.model_runner.sample_tokens(grammar_output)
    (Worker_TP2 pid=279587) ERROR 04-07 16:04:24 [multiproc_executor.py:852]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    (Worker_TP2 pid=279587) ERROR 04-07 16:04:24 [multiproc_executor.py:852]   File "/opt/conda/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
    (Worker_TP2 pid=279587) ERROR 04-07 16:04:24 [multiproc_executor.py:852]     return func(*args, **kwargs)
    (Worker_TP2 pid=279587) ERROR 04-07 16:04:24 [multiproc_executor.py:852]            ^^^^^^^^^^^^^^^^^^^^^
    (Worker_TP2 pid=279587) ERROR 04-07 16:04:24 [multiproc_executor.py:852]   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/worker/gpu_model_runner.py", line 3656, in sample_tokens
    (Worker_TP2 pid=279587) ERROR 04-07 16:04:24 [multiproc_executor.py:852]     propose_draft_token_ids(sampled_token_ids)
    (Worker_TP2 pid=279587) ERROR 04-07 16:04:24 [multiproc_executor.py:852]   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/worker/gpu_model_runner.py", line 3627, in propose_draft_token_ids
    (Worker_TP2 pid=279587) ERROR 04-07 16:04:24 [multiproc_executor.py:852]     self._draft_token_ids = self.propose_draft_token_ids(
    (Worker_TP2 pid=279587) ERROR 04-07 16:04:24 [multiproc_executor.py:852]                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    (Worker_TP2 pid=279587) ERROR 04-07 16:04:24 [multiproc_executor.py:852]   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/worker/gpu_model_runner.py", line 3990, in propose_draft_token_ids
    (Worker_TP2 pid=279587) ERROR 04-07 16:04:24 [multiproc_executor.py:852]     draft_token_ids = self.drafter.propose(
    (Worker_TP2 pid=279587) ERROR 04-07 16:04:24 [multiproc_executor.py:852]                       ^^^^^^^^^^^^^^^^^^^^^
    (Worker_TP2 pid=279587) ERROR 04-07 16:04:24 [multiproc_executor.py:852]   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/spec_decode/eagle.py", line 406, in propose
    (Worker_TP2 pid=279587) ERROR 04-07 16:04:24 [multiproc_executor.py:852]     positions = self.positions[:, last_token_indices]
    (Worker_TP2 pid=279587) ERROR 04-07 16:04:24 [multiproc_executor.py:852]                 ^^^^^^^^^^^^^^
    (Worker_TP2 pid=279587) ERROR 04-07 16:04:24 [multiproc_executor.py:852] AttributeError: 'EagleProposer' object has no attribute 'positions'
    (Worker_TP2 pid=279587) ERROR 04-07 16:04:24 [multiproc_executor.py:852] WorkerProc hit an exception.
    (Worker_TP2 pid=279587) ERROR 04-07 16:04:24 [multiproc_executor.py:852] Traceback (most recent call last):
    (Worker_TP2 pid=279587) ERROR 04-07 16:04:24 [multiproc_executor.py:852]   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/executor/multiproc_executor.py", line 847, in worker_busy_loop
    (Worker_TP2 pid=279587) ERROR 04-07 16:04:24 [multiproc_executor.py:852]     output = func(*args, **kwargs)
    (Worker_TP2 pid=279587) ERROR 04-07 16:04:24 [multiproc_executor.py:852]              ^^^^^^^^^^^^^^^^^^^^^
    (Worker_TP2 pid=279587) ERROR 04-07 16:04:24 [multiproc_executor.py:852]   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/worker/worker_base.py", line 365, in execute_model
    (Worker_TP2 pid=279587) ERROR 04-07 16:04:24 [multiproc_executor.py:852]     return self.worker.execute_model(scheduler_output)
    (Worker_TP2 pid=279587) ERROR 04-07 16:04:24 [multiproc_executor.py:852]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    (Worker_TP2 pid=279587) ERROR 04-07 16:04:24 [multiproc_executor.py:852]   File "/opt/conda/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
    (Worker_TP2 pid=279587) ERROR 04-07 16:04:24 [multiproc_executor.py:852]     return func(*args, **kwargs)
    (Worker_TP2 pid=279587) ERROR 04-07 16:04:24 [multiproc_executor.py:852]            ^^^^^^^^^^^^^^^^^^^^^
    (Worker_TP2 pid=279587) ERROR 04-07 16:04:24 [multiproc_executor.py:852]   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/worker/gpu_worker.py", line 630, in execute_model
    (Worker_TP2 pid=279587) ERROR 04-07 16:04:24 [multiproc_executor.py:852]     output = self.model_runner.execute_model(
    (Worker_TP2 pid=279587) ERROR 04-07 16:04:24 [multiproc_executor.py:852]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    (Worker_TP2 pid=279587) ERROR 04-07 16:04:24 [multiproc_executor.py:852]   File "/opt/conda/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
    (Worker_TP2 pid=279587) ERROR 04-07 16:04:24 [multiproc_executor.py:852]     return func(*args, **kwargs)
    (Worker_TP2 pid=279587) ERROR 04-07 16:04:24 [multiproc_executor.py:852]            ^^^^^^^^^^^^^^^^^^^^^
    (Worker_TP2 pid=279587) ERROR 04-07 16:04:24 [multiproc_executor.py:852]   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/worker/gpu_model_runner.py", line 3333, in execute_model
    (Worker_TP2 pid=279587) ERROR 04-07 16:04:24 [multiproc_executor.py:852]     logits_indices, spec_decode_metadata = self._prepare_inputs(
    (Worker_TP2 pid=279587) ERROR 04-07 16:04:24 [multiproc_executor.py:852]                                            ^^^^^^^^^^^^^^^^^^^^^
    (Worker_TP2 pid=279587) ERROR 04-07 16:04:24 [multiproc_executor.py:852]   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/worker/gpu_model_runner.py", line 1557, in _prepare_inputs
    (Worker_TP2 pid=279587) ERROR 04-07 16:04:24 [multiproc_executor.py:852]     self._prepare_input_ids(
    (Worker_TP2 pid=279587) ERROR 04-07 16:04:24 [multiproc_executor.py:852]   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/worker/gpu_model_runner.py", line 1283, in _prepare_input_ids
    (Worker_TP2 pid=279587) ERROR 04-07 16:04:24 [multiproc_executor.py:852]     assert prev_req_id_to_index is not None
    (Worker_TP2 pid=279587) ERROR 04-07 16:04:24 [multiproc_executor.py:852]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    (Worker_TP2 pid=279587) ERROR 04-07 16:04:24 [multiproc_executor.py:852] AssertionError
    (Worker_TP2 pid=279587) ERROR 04-07 16:04:24 [multiproc_executor.py:852] Traceback (most recent call last):
    (Worker_TP2 pid=279587) ERROR 04-07 16:04:24 [multiproc_executor.py:852]   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/executor/multiproc_executor.py", line 847, in worker_busy_loop
    (Worker_TP2 pid=279587) ERROR 04-07 16:04:24 [multiproc_executor.py:852]     output = func(*args, **kwargs)
    (Worker_TP2 pid=279587) ERROR 04-07 16:04:24 [multiproc_executor.py:852]              ^^^^^^^^^^^^^^^^^^^^^
    (Worker_TP2 pid=279587) ERROR 04-07 16:04:24 [multiproc_executor.py:852]   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/worker/worker_base.py", line 365, in execute_model
    (Worker_TP2 pid=279587) ERROR 04-07 16:04:24 [multiproc_executor.py:852]     return self.worker.execute_model(scheduler_output)
    (Worker_TP2 pid=279587) ERROR 04-07 16:04:24 [multiproc_executor.py:852]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    (Worker_TP2 pid=279587) ERROR 04-07 16:04:24 [multiproc_executor.py:852]   File "/opt/conda/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
    (Worker_TP2 pid=279587) ERROR 04-07 16:04:24 [multiproc_executor.py:852]     return func(*args, **kwargs)
    (Worker_TP2 pid=279587) ERROR 04-07 16:04:24 [multiproc_executor.py:852]            ^^^^^^^^^^^^^^^^^^^^^
    (Worker_TP2 pid=279587) ERROR 04-07 16:04:24 [multiproc_executor.py:852]   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/worker/gpu_worker.py", line 630, in execute_model
    (Worker_TP2 pid=279587) ERROR 04-07 16:04:24 [multiproc_executor.py:852]     output = self.model_runner.execute_model(
    (Worker_TP2 pid=279587) ERROR 04-07 16:04:24 [multiproc_executor.py:852]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    (Worker_TP2 pid=279587) ERROR 04-07 16:04:24 [multiproc_executor.py:852]   File "/opt/conda/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
    (Worker_TP2 pid=279587) ERROR 04-07 16:04:24 [multiproc_executor.py:852]     return func(*args, **kwargs)
    (Worker_TP2 pid=279587) ERROR 04-07 16:04:24 [multiproc_executor.py:852]            ^^^^^^^^^^^^^^^^^^^^^
    (Worker_TP2 pid=279587) ERROR 04-07 16:04:24 [multiproc_executor.py:852]   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/worker/gpu_model_runner.py", line 3333, in execute_model
    (Worker_TP2 pid=279587) ERROR 04-07 16:04:24 [multiproc_executor.py:852]     logits_indices, spec_decode_metadata = self._prepare_inputs(
    (Worker_TP2 pid=279587) ERROR 04-07 16:04:24 [multiproc_executor.py:852]                                            ^^^^^^^^^^^^^^^^^^^^^
    (Worker_TP2 pid=279587) ERROR 04-07 16:04:24 [multiproc_executor.py:852]   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/worker/gpu_model_runner.py", line 1557, in _prepare_inputs
    (Worker_TP2 pid=279587) ERROR 04-07 16:04:24 [multiproc_executor.py:852]     self._prepare_input_ids(
    (Worker_TP2 pid=279587) ERROR 04-07 16:04:24 [multiproc_executor.py:852]   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/worker/gpu_model_runner.py", line 1283, in _prepare_input_ids
    (Worker_TP2 pid=279587) ERROR 04-07 16:04:24 [multiproc_executor.py:852]     assert prev_req_id_to_index is not None
    (Worker_TP2 pid=279587) ERROR 04-07 16:04:24 [multiproc_executor.py:852]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    (Worker_TP2 pid=279587) ERROR 04-07 16:04:24 [multiproc_executor.py:852] AssertionError
    (Worker_TP2 pid=279587) ERROR 04-07 16:04:24 [multiproc_executor.py:852]
    (Worker_TP0 pid=279585) ERROR 04-07 16:04:24 [multiproc_executor.py:852] WorkerProc hit an exception.
    (Worker_TP0 pid=279585) ERROR 04-07 16:04:24 [multiproc_executor.py:852] Traceback (most recent call last):
    (Worker_TP0 pid=279585) ERROR 04-07 16:04:24 [multiproc_executor.py:852]   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/executor/multiproc_executor.py", line 847, in worker_busy_loop
    (Worker_TP0 pid=279585) ERROR 04-07 16:04:24 [multiproc_executor.py:852]     output = func(*args, **kwargs)
    (Worker_TP0 pid=279585) ERROR 04-07 16:04:24 [multiproc_executor.py:852]              ^^^^^^^^^^^^^^^^^^^^^
    (Worker_TP0 pid=279585) ERROR 04-07 16:04:24 [multiproc_executor.py:852]   File "/opt/conda/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
    (Worker_TP0 pid=279585) ERROR 04-07 16:04:24 [multiproc_executor.py:852]     return func(*args, **kwargs)
    (Worker_TP0 pid=279585) ERROR 04-07 16:04:24 [multiproc_executor.py:852]            ^^^^^^^^^^^^^^^^^^^^^
    (Worker_TP0 pid=279585) ERROR 04-07 16:04:24 [multiproc_executor.py:852]   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/worker/gpu_worker.py", line 579, in sample_tokens
    (Worker_TP0 pid=279585) ERROR 04-07 16:04:24 [multiproc_executor.py:852]     return self.model_runner.sample_tokens(grammar_output)
    (Worker_TP0 pid=279585) ERROR 04-07 16:04:24 [multiproc_executor.py:852]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    (Worker_TP0 pid=279585) ERROR 04-07 16:04:24 [multiproc_executor.py:852]   File "/opt/conda/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
    (Worker_TP0 pid=279585) ERROR 04-07 16:04:24 [multiproc_executor.py:852]     return func(*args, **kwargs)
    (Worker_TP0 pid=279585) ERROR 04-07 16:04:24 [multiproc_executor.py:852]            ^^^^^^^^^^^^^^^^^^^^^
    (Worker_TP0 pid=279585) ERROR 04-07 16:04:24 [multiproc_executor.py:852]   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/worker/gpu_model_runner.py", line 3656, in sample_tokens
    (Worker_TP0 pid=279585) ERROR 04-07 16:04:24 [multiproc_executor.py:852]     propose_draft_token_ids(sampled_token_ids)
    (Worker_TP0 pid=279585) ERROR 04-07 16:04:24 [multiproc_executor.py:852]   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/worker/gpu_model_runner.py", line 3627, in propose_draft_token_ids
    (Worker_TP0 pid=279585) ERROR 04-07 16:04:24 [multiproc_executor.py:852]     self._draft_token_ids = self.propose_draft_token_ids(
    (Worker_TP0 pid=279585) ERROR 04-07 16:04:24 [multiproc_executor.py:852]                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    (Worker_TP0 pid=279585) ERROR 04-07 16:04:24 [multiproc_executor.py:852]   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/worker/gpu_model_runner.py", line 3990, in propose_draft_token_ids
    (Worker_TP0 pid=279585) ERROR 04-07 16:04:24 [multiproc_executor.py:852]     draft_token_ids = self.drafter.propose(
    (Worker_TP0 pid=279585) ERROR 04-07 16:04:24 [multiproc_executor.py:852]                       ^^^^^^^^^^^^^^^^^^^^^
    (Worker_TP0 pid=279585) ERROR 04-07 16:04:24 [multiproc_executor.py:852]   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/spec_decode/eagle.py", line 406, in propose
    (Worker_TP0 pid=279585) ERROR 04-07 16:04:24 [multiproc_executor.py:852]     positions = self.positions[:, last_token_indices]
    (Worker_TP0 pid=279585) ERROR 04-07 16:04:24 [multiproc_executor.py:852]                 ^^^^^^^^^^^^^^
    (Worker_TP0 pid=279585) ERROR 04-07 16:04:24 [multiproc_executor.py:852] AttributeError: 'EagleProposer' object has no attribute 'positions'
    (Worker_TP0 pid=279585) ERROR 04-07 16:04:24 [multiproc_executor.py:852] Traceback (most recent call last):
    (Worker_TP0 pid=279585) ERROR 04-07 16:04:24 [multiproc_executor.py:852]   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/executor/multiproc_executor.py", line 847, in worker_busy_loop
    (Worker_TP0 pid=279585) ERROR 04-07 16:04:24 [multiproc_executor.py:852]     output = func(*args, **kwargs)
    (Worker_TP0 pid=279585) ERROR 04-07 16:04:24 [multiproc_executor.py:852]              ^^^^^^^^^^^^^^^^^^^^^
    (Worker_TP0 pid=279585) ERROR 04-07 16:04:24 [multiproc_executor.py:852]   File "/opt/conda/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
    (Worker_TP0 pid=279585) ERROR 04-07 16:04:24 [multiproc_executor.py:852]     return func(*args, **kwargs)
    (Worker_TP0 pid=279585) ERROR 04-07 16:04:24 [multiproc_executor.py:852]            ^^^^^^^^^^^^^^^^^^^^^
    (Worker_TP0 pid=279585) ERROR 04-07 16:04:24 [multiproc_executor.py:852]   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/worker/gpu_worker.py", line 579, in sample_tokens
    (Worker_TP0 pid=279585) ERROR 04-07 16:04:24 [multiproc_executor.py:852]     return self.model_runner.sample_tokens(grammar_output)
    (Worker_TP0 pid=279585) ERROR 04-07 16:04:24 [multiproc_executor.py:852]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    (Worker_TP0 pid=279585) ERROR 04-07 16:04:24 [multiproc_executor.py:852]   File "/opt/conda/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
    (Worker_TP0 pid=279585) ERROR 04-07 16:04:24 [multiproc_executor.py:852]     return func(*args, **kwargs)
    (Worker_TP0 pid=279585) ERROR 04-07 16:04:24 [multiproc_executor.py:852]            ^^^^^^^^^^^^^^^^^^^^^
    (Worker_TP0 pid=279585) ERROR 04-07 16:04:24 [multiproc_executor.py:852]   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/worker/gpu_model_runner.py", line 3656, in sample_tokens
    (Worker_TP0 pid=279585) ERROR 04-07 16:04:24 [multiproc_executor.py:852]     propose_draft_token_ids(sampled_token_ids)
    (Worker_TP0 pid=279585) ERROR 04-07 16:04:24 [multiproc_executor.py:852]   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/worker/gpu_model_runner.py", line 3627, in propose_draft_token_ids
    (Worker_TP0 pid=279585) ERROR 04-07 16:04:24 [multiproc_executor.py:852]     self._draft_token_ids = self.propose_draft_token_ids(
    (Worker_TP0 pid=279585) ERROR 04-07 16:04:24 [multiproc_executor.py:852]                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    (Worker_TP0 pid=279585) ERROR 04-07 16:04:24 [multiproc_executor.py:852]   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/worker/gpu_model_runner.py", line 3990, in propose_draft_token_ids
    (Worker_TP0 pid=279585) ERROR 04-07 16:04:24 [multiproc_executor.py:852]     draft_token_ids = self.drafter.propose(
    (Worker_TP0 pid=279585) ERROR 04-07 16:04:24 [multiproc_executor.py:852]                       ^^^^^^^^^^^^^^^^^^^^^
    (Worker_TP0 pid=279585) ERROR 04-07 16:04:24 [multiproc_executor.py:852]   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/spec_decode/eagle.py", line 406, in propose
    (Worker_TP0 pid=279585) ERROR 04-07 16:04:24 [multiproc_executor.py:852]     positions = self.positions[:, last_token_indices]
    (Worker_TP0 pid=279585) ERROR 04-07 16:04:24 [multiproc_executor.py:852]                 ^^^^^^^^^^^^^^
    (Worker_TP0 pid=279585) ERROR 04-07 16:04:24 [multiproc_executor.py:852] AttributeError: 'EagleProposer' object has no attribute 'positions'
    (Worker_TP0 pid=279585) ERROR 04-07 16:04:24 [multiproc_executor.py:852]
    (Worker_TP7 pid=279592) ERROR 04-07 16:04:24 [multiproc_executor.py:852] WorkerProc hit an exception.
    (Worker_TP7 pid=279592) ERROR 04-07 16:04:24 [multiproc_executor.py:852] Traceback (most recent call last):
    (Worker_TP7 pid=279592) ERROR 04-07 16:04:24 [multiproc_executor.py:852]   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/executor/multiproc_executor.py", line 847, in worker_busy_loop
    (Worker_TP7 pid=279592) ERROR 04-07 16:04:24 [multiproc_executor.py:852]     output = func(*args, **kwargs)
    (Worker_TP7 pid=279592) ERROR 04-07 16:04:24 [multiproc_executor.py:852]              ^^^^^^^^^^^^^^^^^^^^^
    (Worker_TP7 pid=279592) ERROR 04-07 16:04:24 [multiproc_executor.py:852]   File "/opt/conda/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
    (Worker_TP7 pid=279592) ERROR 04-07 16:04:24 [multiproc_executor.py:852]     return func(*args, **kwargs)
    (Worker_TP7 pid=279592) ERROR 04-07 16:04:24 [multiproc_executor.py:852]            ^^^^^^^^^^^^^^^^^^^^^
    (Worker_TP7 pid=279592) ERROR 04-07 16:04:24 [multiproc_executor.py:852]   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/worker/gpu_worker.py", line 579, in sample_tokens
    (Worker_TP7 pid=279592) ERROR 04-07 16:04:24 [multiproc_executor.py:852]     return self.model_runner.sample_tokens(grammar_output)
    (Worker_TP7 pid=279592) ERROR 04-07 16:04:24 [multiproc_executor.py:852]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    (Worker_TP7 pid=279592) ERROR 04-07 16:04:24 [multiproc_executor.py:852]   File "/opt/conda/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
    (Worker_TP7 pid=279592) ERROR 04-07 16:04:24 [multiproc_executor.py:852]     return func(*args, **kwargs)
    (Worker_TP7 pid=279592) ERROR 04-07 16:04:24 [multiproc_executor.py:852]            ^^^^^^^^^^^^^^^^^^^^^
    (Worker_TP7 pid=279592) ERROR 04-07 16:04:24 [multiproc_executor.py:852]   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/worker/gpu_model_runner.py", line 3656, in sample_tokens
    (Worker_TP7 pid=279592) ERROR 04-07 16:04:24 [multiproc_executor.py:852]     propose_draft_token_ids(sampled_token_ids)
    (Worker_TP7 pid=279592) ERROR 04-07 16:04:24 [multiproc_executor.py:852]   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/worker/gpu_model_runner.py", line 3627, in propose_draft_token_ids
    (Worker_TP7 pid=279592) ERROR 04-07 16:04:24 [multiproc_executor.py:852]     self._draft_token_ids = self.propose_draft_token_ids(
    (Worker_TP7 pid=279592) ERROR 04-07 16:04:24 [multiproc_executor.py:852]                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    (Worker_TP7 pid=279592) ERROR 04-07 16:04:24 [multiproc_executor.py:852]   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/worker/gpu_model_runner.py", line 3990, in propose_draft_token_ids
    (Worker_TP7 pid=279592) ERROR 04-07 16:04:24 [multiproc_executor.py:852]     draft_token_ids = self.drafter.propose(
    (Worker_TP7 pid=279592) ERROR 04-07 16:04:24 [multiproc_executor.py:852]                       ^^^^^^^^^^^^^^^^^^^^^
    (Worker_TP7 pid=279592) ERROR 04-07 16:04:24 [multiproc_executor.py:852]   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/spec_decode/eagle.py", line 406, in propose
    (Worker_TP7 pid=279592) ERROR 04-07 16:04:24 [multiproc_executor.py:852]     positions = self.positions[:, last_token_indices]
    (Worker_TP7 pid=279592) ERROR 04-07 16:04:24 [multiproc_executor.py:852]                 ^^^^^^^^^^^^^^
    (Worker_TP7 pid=279592) ERROR 04-07 16:04:24 [multiproc_executor.py:852] AttributeError: 'EagleProposer' object has no attribute 'positions'
    (EngineCore_DP0 pid=279447) ERROR 04-07 16:04:24 [dump_input.py:72] Dumping input data for V1 LLM engine (v0.15.0) with config: model='/data/metax-tech/Qwen3.5-397B-A17B-W8A8', speculative_config=SpeculativeConfig(method='mtp', model='/data/metax-tech/Qwen3.5-397B-A17B-W8A8', num_spec_tokens=2), tokenizer='/data/metax-tech/Qwen3.5-397B-A17B-W8A8', skip_tokenizer_init=False, tokenizer_mode=auto, revision=None, tokenizer_revision=None, trust_remote_code=False, dtype=torch.bfloat16, max_seq_len=262144, download_dir=None, load_format=auto, tensor_parallel_size=8, pipeline_parallel_size=1, data_parallel_size=1, disable_custom_all_reduce=True, quantization=compressed-tensors, enforce_eager=False, enable_return_routed_experts=False, kv_cache_dtype=auto, device_config=cuda, structured_outputs_config=StructuredOutputsConfig(backend='auto', disable_fallback=False, disable_any_whitespace=False, disable_additional_properties=False, reasoning_parser='qwen3', reasoning_parser_plugin='', enable_in_reasoning=False), observability_config=ObservabilityConfig(show_hidden_metrics_for_version=None, otlp_traces_endpoint=None, collect_detailed_traces=None, kv_cache_metrics=False, kv_cache_metrics_sample=0.01, cudagraph_metrics=False, enable_layerwise_nvtx_tracing=False, enable_mfu_metrics=False, enable_mm_processor_stats=False, enable_logging_iteration_details=False), seed=0, served_model_name=Qwen3.5-W8A8, enable_prefix_caching=False, enable_chunked_prefill=True, pooler_config=None, compilation_config={'level': None, 'mode': <CompilationMode.VLLM_COMPILE: 3>, 'debug_dump_path': None, 'cache_dir': '', 'compile_cache_save_format': 'binary', 'backend': 'inductor', 'custom_ops': ['none'], 'splitting_ops': ['vllm::unified_attention', 'vllm::unified_attention_with_output', 'vllm::unified_mla_attention', 'vllm::unified_mla_attention_with_output', 'vllm::mamba_mixer2', 'vllm::mamba_mixer', 'vllm::short_conv', 'vllm::linear_attention', 'vllm::plamo2_mamba_mixer', 'vllm::gdn_attention_core', 
'vllm::kda_attention', 'vllm::sparse_attn_indexer', 'vllm::rocm_aiter_sparse_attn_indexer', 'vllm::mx_sparse_attn_indexer'], 'compile_mm_encoder': False, 'compile_sizes': [], 'compile_ranges_split_points': [2048], 'inductor_compile_config': {'enable_auto_functionalized_v2': False}, 'inductor_passes': {}, 'cudagraph_mode': <CUDAGraphMode.FULL_AND_PIECEWISE: (2, 1)>, 'cudagraph_num_of_warmups': 1, 'cudagraph_capture_sizes': [1, 2, 4, 8, 16, 24, 32, 40, 48, 56, 64, 72, 80, 88, 96, 104, 112, 120, 128, 136, 144, 152, 160, 168, 176, 184, 192, 200, 208, 216, 224, 232, 240, 248, 256, 272, 288, 304, 320, 336, 352, 368, 384, 400, 416, 432, 448, 464, 480, 496, 512], 'cudagraph_copy_inputs': False, 'cudagraph_specialize_lora': True, 'use_inductor_graph_partition': False, 'pass_config': {'fuse_norm_quant': False, 'fuse_act_quant': False, 'fuse_attn_quant': False, 'eliminate_noops': True, 'enable_sp': False, 'fuse_gemm_comms': False, 'fuse_allreduce_rms': False}, 'max_cudagraph_capture_size': 512, 'dynamic_shapes_config': {'type': <DynamicShapesType.BACKED: 'backed'>, 'evaluate_guards': False, 'assume_32_bit_indexing': True}, 'local_cache_dir': None},
    (EngineCore_DP0 pid=279447) ERROR 04-07 16:04:24 [dump_input.py:79] Dumping scheduler output for model execution: SchedulerOutput(scheduled_new_reqs=[NewRequestData(req_id=chatcmpl-acfd252e27f29942-bb0bd1ec,prompt_token_ids_len=11,prefill_token_ids_len=None,mm_features=[],sampling_params=SamplingParams(n=1, presence_penalty=0.0, frequency_penalty=0.0, repetition_penalty=1.0, temperature=0.6, top_p=0.95, top_k=20, min_p=0.0, seed=None, stop=[], stop_token_ids=[248044], bad_words=[], include_stop_str_in_output=False, ignore_eos=False, max_tokens=5, min_tokens=0, logprobs=None, prompt_logprobs=None, skip_special_tokens=True, spaces_between_special_tokens=True, truncate_prompt_tokens=None, structured_outputs=None, extra_args=None),block_ids=([1, 2, 3], [4, 5, 6], [7, 8, 9], [10]),num_computed_tokens=0,lora_request=None,prompt_embeds_shape=None)], scheduled_cached_reqs=CachedRequestData(req_ids=[],resumed_req_ids=set(),new_token_ids_lens=[],all_token_ids_lens={},new_block_ids=[],num_computed_tokens=[],num_output_tokens=[]), num_scheduled_tokens={chatcmpl-acfd252e27f29942-bb0bd1ec: 11}, total_num_scheduled_tokens=11, scheduled_spec_decode_tokens={}, scheduled_encoder_inputs={}, num_common_prefix_blocks=[0, 0, 0, 0], finished_req_ids=[], free_encoder_mm_hashes=[], preempted_req_ids=[], has_structured_output_requests=false, pending_structured_output_tokens=false, num_invalid_spec_tokens=null, kv_connector_metadata=null, ec_connector_metadata=null)
    (EngineCore_DP0 pid=279447) ERROR 04-07 16:04:24 [dump_input.py:81] Dumping scheduler stats: SchedulerStats(num_running_reqs=1, num_waiting_reqs=0, step_counter=0, current_wave=0, kv_cache_usage=0.019379844961240345, prefix_cache_stats=PrefixCacheStats(reset=False, requests=0, queries=0, hits=0, preempted_requests=0, preempted_queries=0, preempted_hits=0), connector_prefix_cache_stats=None, kv_cache_eviction_events=[], spec_decoding_stats=None, kv_connector_stats=None, waiting_lora_adapters={}, running_lora_adapters={}, cudagraph_stats=None, perf_stats=None)
    (EngineCore_DP0 pid=279447) ERROR 04-07 16:04:24 [core.py:948] EngineCore encountered a fatal error.
    (EngineCore_DP0 pid=279447) ERROR 04-07 16:04:24 [core.py:948] Traceback (most recent call last):
    (EngineCore_DP0 pid=279447) ERROR 04-07 16:04:24 [core.py:948]   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/engine/core.py", line 939, in run_engine_core
    (EngineCore_DP0 pid=279447) ERROR 04-07 16:04:24 [core.py:948]     engine_core.run_busy_loop()
    (EngineCore_DP0 pid=279447) ERROR 04-07 16:04:24 [core.py:948]   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/engine/core.py", line 966, in run_busy_loop
    (EngineCore_DP0 pid=279447) ERROR 04-07 16:04:24 [core.py:948]     self._process_engine_step()
    (EngineCore_DP0 pid=279447) ERROR 04-07 16:04:24 [core.py:948]   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/engine/core.py", line 999, in _process_engine_step
    (EngineCore_DP0 pid=279447) ERROR 04-07 16:04:24 [core.py:948]     outputs, model_executed = self.step_fn()
    (EngineCore_DP0 pid=279447) ERROR 04-07 16:04:24 [core.py:948]                               ^^^^^^^^^^^^^^
    (EngineCore_DP0 pid=279447) ERROR 04-07 16:04:24 [core.py:948]   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/engine/core.py", line 486, in step_with_batch_queue
    (EngineCore_DP0 pid=279447) ERROR 04-07 16:04:24 [core.py:948]     model_output = future.result()
    (EngineCore_DP0 pid=279447) ERROR 04-07 16:04:24 [core.py:948]                    ^^^^^^^^^^^^^^^
    (EngineCore_DP0 pid=279447) ERROR 04-07 16:04:24 [core.py:948]   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/executor/multiproc_executor.py", line 80, in result
    (EngineCore_DP0 pid=279447) ERROR 04-07 16:04:24 [core.py:948]     return super().result()
    (EngineCore_DP0 pid=279447) ERROR 04-07 16:04:24 [core.py:948]            ^^^^^^^^^^^^^^^^
    (EngineCore_DP0 pid=279447) ERROR 04-07 16:04:24 [core.py:948]   File "/opt/conda/lib/python3.12/concurrent/futures/_base.py", line 449, in result
    (EngineCore_DP0 pid=279447) ERROR 04-07 16:04:24 [core.py:948]     return self.__get_result()
    (EngineCore_DP0 pid=279447) ERROR 04-07 16:04:24 [core.py:948]            ^^^^^^^^^^^^^^^^^^^
    (EngineCore_DP0 pid=279447) ERROR 04-07 16:04:24 [core.py:948]   File "/opt/conda/lib/python3.12/concurrent/futures/_base.py", line 401, in __get_result
    (EngineCore_DP0 pid=279447) ERROR 04-07 16:04:24 [core.py:948]     raise self._exception
    (EngineCore_DP0 pid=279447) ERROR 04-07 16:04:24 [core.py:948]   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/executor/multiproc_executor.py", line 84, in wait_for_response
    (EngineCore_DP0 pid=279447) ERROR 04-07 16:04:24 [core.py:948]     response = self.aggregate(get_response())
    (EngineCore_DP0 pid=279447) ERROR 04-07 16:04:24 [core.py:948]                               ^^^^^^^^^^^^^^
    (EngineCore_DP0 pid=279447) ERROR 04-07 16:04:24 [core.py:948]   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/executor/multiproc_executor.py", line 357, in get_response
    (EngineCore_DP0 pid=279447) ERROR 04-07 16:04:24 [core.py:948]     raise RuntimeError(
    (EngineCore_DP0 pid=279447) ERROR 04-07 16:04:24 [core.py:948] RuntimeError: Worker failed with error ''EagleProposer' object has no attribute 'positions'', please check the stack trace above for the root cause
    (Worker_TP6 pid=279591) ERROR 04-07 16:04:24 [multiproc_executor.py:852] WorkerProc hit an exception.
    (Worker_TP6 pid=279591) ERROR 04-07 16:04:24 [multiproc_executor.py:852] Traceback (most recent call last):
    (Worker_TP6 pid=279591) ERROR 04-07 16:04:24 [multiproc_executor.py:852]   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/executor/multiproc_executor.py", line 847, in worker_busy_loop
    (Worker_TP6 pid=279591) ERROR 04-07 16:04:24 [multiproc_executor.py:852]     output = func(*args, **kwargs)
    (Worker_TP6 pid=279591) ERROR 04-07 16:04:24 [multiproc_executor.py:852]              ^^^^^^^^^^^^^^^^^^^^^
    (Worker_TP6 pid=279591) ERROR 04-07 16:04:24 [multiproc_executor.py:852]   File "/opt/conda/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
    (Worker_TP6 pid=279591) ERROR 04-07 16:04:24 [multiproc_executor.py:852]     return func(*args, **kwargs)
    (Worker_TP6 pid=279591) ERROR 04-07 16:04:24 [multiproc_executor.py:852]            ^^^^^^^^^^^^^^^^^^^^^
    (Worker_TP6 pid=279591) ERROR 04-07 16:04:24 [multiproc_executor.py:852]   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/worker/gpu_worker.py", line 579, in sample_tokens
    (Worker_TP6 pid=279591) ERROR 04-07 16:04:24 [multiproc_executor.py:852]     return self.model_runner.sample_tokens(grammar_output)
    (Worker_TP6 pid=279591) ERROR 04-07 16:04:24 [multiproc_executor.py:852]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    (Worker_TP6 pid=279591) ERROR 04-07 16:04:24 [multiproc_executor.py:852]   File "/opt/conda/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
    (Worker_TP6 pid=279591) ERROR 04-07 16:04:24 [multiproc_executor.py:852]     return func(*args, **kwargs)
    (Worker_TP6 pid=279591) ERROR 04-07 16:04:24 [multiproc_executor.py:852]            ^^^^^^^^^^^^^^^^^^^^^
    (Worker_TP6 pid=279591) ERROR 04-07 16:04:24 [multiproc_executor.py:852]   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/worker/gpu_model_runner.py", line 3656, in sample_tokens
    (Worker_TP6 pid=279591) ERROR 04-07 16:04:24 [multiproc_executor.py:852]     propose_draft_token_ids(sampled_token_ids)
    (Worker_TP6 pid=279591) ERROR 04-07 16:04:24 [multiproc_executor.py:852]   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/worker/gpu_model_runner.py", line 3627, in propose_draft_token_ids
    (Worker_TP6 pid=279591) ERROR 04-07 16:04:24 [multiproc_executor.py:852]     self._draft_token_ids = self.propose_draft_token_ids(
    (Worker_TP6 pid=279591) ERROR 04-07 16:04:24 [multiproc_executor.py:852]                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    (Worker_TP6 pid=279591) ERROR 04-07 16:04:24 [multiproc_executor.py:852]   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/worker/gpu_model_runner.py", line 3990, in propose_draft_token_ids
    (Worker_TP6 pid=279591) ERROR 04-07 16:04:24 [multiproc_executor.py:852]     draft_token_ids = self.drafter.propose(
    (Worker_TP6 pid=279591) ERROR 04-07 16:04:24 [multiproc_executor.py:852]                       ^^^^^^^^^^^^^^^^^^^^^
    (Worker_TP6 pid=279591) ERROR 04-07 16:04:24 [multiproc_executor.py:852]   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/spec_decode/eagle.py", line 406, in propose
    (Worker_TP6 pid=279591) ERROR 04-07 16:04:24 [multiproc_executor.py:852]     positions = self.positions[:, last_token_indices]
    (Worker_TP6 pid=279591) ERROR 04-07 16:04:24 [multiproc_executor.py:852]                 ^^^^^^^^^^^^^^
    (Worker_TP6 pid=279591) ERROR 04-07 16:04:24 [multiproc_executor.py:852] AttributeError: 'EagleProposer' object has no attribute 'positions'
    (Worker_TP2 pid=279587) INFO 04-07 16:04:24 [multiproc_executor.py:730] Parent process exited, terminating worker
    (Worker_TP6 pid=279591) INFO 04-07 16:04:24 [multiproc_executor.py:730] Parent process exited, terminating worker
    (Worker_TP2 pid=279587) INFO 04-07 16:04:24 [multiproc_executor.py:774] WorkerProc shutting down.
    

    Full logs are attached.
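
    Judging from the traceback, the crash happens inside vLLM's Eagle/MTP drafter (`EagleProposer.propose` reads `self.positions` before it is ever initialized), and that code path only runs when speculative decoding is enabled. Until the drafter is fixed in this vllm-metax build, one possible workaround is to launch without the speculative-decoding options. The command below is only a sketch: paths and values are copied from the config dump above, and the flags are standard `vllm serve` options; adjust for your environment.

    ```shell
    # Workaround sketch: start vLLM without speculative decoding so the
    # Eagle/MTP drafter (EagleProposer) is never constructed.
    # Paths/values taken from the config dump above; adjust as needed.
    vllm serve /data/metax-tech/Qwen3.5-397B-A17B-W8A8 \
      --host 0.0.0.0 \
      --port 8000 \
      --tensor-parallel-size 8 \
      --max-model-len 262144 \
      --served-model-name Qwen3.5-W8A8
    ```

    This trades away the speculative-decoding speedup, but should confirm whether the rest of the stack is healthy.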

  • See post
    OverS9982
    Members
    Error running Qwen3.5-397B-A17B-W8A8 [Solved] 2026-04-07 14:57

    Many thanks, it starts up successfully now.

  • See post
    OverS9982
    Members
    Error running Qwen3.5-397B-A17B-W8A8 [Solved] 2026-04-07 14:14

    As the error message suggested, I ran

    pip install --upgrade transformers
    

    which produced the following errors:

    ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
    vllm 0.15.0 requires flashinfer-python==0.6.1, which is not installed.
    vllm 0.15.0 requires opencv-python-headless>=4.13.0, but you have opencv-python-headless 4.11.0.86 which is incompatible.
    vllm 0.15.0 requires torch==2.9.1, but you have torch 2.8.0+metax3.5.3.9 which is incompatible.
    vllm 0.15.0 requires torchaudio==2.9.1, but you have torchaudio 2.4.1+metax3.5.3.9 which is incompatible.
    vllm 0.15.0 requires torchvision==0.24.1, but you have torchvision 0.15.1+metax3.5.3.9 which is incompatible.
    vllm 0.15.0 requires transformers<5,>=4.56.0, but you have transformers 5.5.0 which is incompatible.
    vllm-metax 0.15.0+g24fb31.d20260310.maca3.5.3.20.torch2.8 requires transformers<5,>=4.56.0, but you have transformers 5.5.0 which is incompatible.
    Successfully installed hf-xet-1.4.3 huggingface-hub-1.9.0 transformers-5.5.0
    WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager, possibly rendering your system unusable.It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv. Use the --root-user-action option if you know what you are doing and want to suppress this warning.
    
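    The root of these conflicts is that vllm 0.15.0 and vllm-metax both pin `transformers<5,>=4.56.0`, while the `qwen3_5_moe` checkpoint needs a newer Transformers release. Before restarting the server, it can help to check whether an installed version actually satisfies a pip constraint. The snippet below is a small sketch using the standard `packaging` library (the constraint string is copied from the pip output above):

    ```python
    # Pre-flight sketch: check an installed version against a pip-style
    # version constraint, using the 'packaging' library that pip itself uses.
    from packaging.specifiers import SpecifierSet
    from packaging.version import Version

    def satisfies(installed: str, constraint: str) -> bool:
        """Return True if `installed` falls inside `constraint`."""
        return Version(installed) in SpecifierSet(constraint)

    # vllm 0.15.0 declares transformers<5,>=4.56.0 (see the pip output above),
    # so transformers 5.5.0 is outside the supported range:
    print(satisfies("5.5.0", ">=4.56.0,<5"))   # False
    print(satisfies("4.57.1", ">=4.56.0,<5"))  # True
    ```

    In other words, pip installed transformers 5.5.0 anyway and only warned about the conflict, which explains the breakage that follows.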

    Ignoring those errors, I ran vllm serve anyway.
    The main error output was:

    INFO 04-07 13:41:34 [multiproc_executor.py:730] Parent process exited, terminating worker
    INFO 04-07 13:41:34 [multiproc_executor.py:730] Parent process exited, terminating worker
    INFO 04-07 13:41:34 [multiproc_executor.py:730] Parent process exited, terminating worker
    INFO 04-07 13:41:34 [multiproc_executor.py:730] Parent process exited, terminating worker
    INFO 04-07 13:41:34 [multiproc_executor.py:730] Parent process exited, terminating worker
    INFO 04-07 13:41:34 [multiproc_executor.py:730] Parent process exited, terminating worker
    INFO 04-07 13:41:34 [multiproc_executor.py:730] Parent process exited, terminating worker
    ERROR 04-07 13:41:34 [multiproc_executor.py:772] WorkerProc failed to start.
    ERROR 04-07 13:41:34 [multiproc_executor.py:772] Traceback (most recent call last):
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/executor/multiproc_executor.py", line 743, in worker_main
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]     worker = WorkerProc(*args, **kwargs)
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/executor/multiproc_executor.py", line 569, in __init__
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]     self.worker.init_device()
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/worker/worker_base.py", line 326, in init_device
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]     self.worker.init_device()  # type: ignore
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]     ^^^^^^^^^^^^^^^^^^^^^^^^^
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/worker/gpu_worker.py", line 262, in init_device
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]     self.model_runner = GPUModelRunnerV1(self.vllm_config, self.device)
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/worker/gpu_model_runner.py", line 647, in __init__
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]     MultiModalBudget(self.vllm_config, self.mm_registry)
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/worker/utils.py", line 45, in __init__
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]     max_tokens_by_modality = mm_registry.get_max_tokens_per_item_by_modality(
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "/opt/conda/lib/python3.12/site-packages/vllm/multimodal/registry.py", line 177, in get_max_tokens_per_item_by_modality
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]     max_tokens_per_item = processor.info.get_mm_max_tokens_per_item(
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "/opt/conda/lib/python3.12/site-packages/vllm/model_executor/models/qwen2_vl.py", line 817, in get_mm_max_tokens_per_item
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]     max_image_tokens = self.get_max_image_tokens()
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "/opt/conda/lib/python3.12/site-packages/vllm/model_executor/models/qwen2_vl.py", line 939, in get_max_image_tokens
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]     target_width, target_height = self.get_image_size_with_most_features()
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "/opt/conda/lib/python3.12/site-packages/vllm/model_executor/models/qwen2_vl.py", line 916, in get_image_size_with_most_features
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]     image_processor = self.get_image_processor()
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]                       ^^^^^^^^^^^^^^^^^^^^^^^^^^
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "/opt/conda/lib/python3.12/site-packages/vllm/model_executor/models/qwen3_vl.py", line 622, in get_image_processor
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]     return self.get_hf_processor(**kwargs).image_processor
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "/opt/conda/lib/python3.12/site-packages/vllm/model_executor/models/qwen3_vl.py", line 615, in get_hf_processor
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]     return self.ctx.get_hf_processor(
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]            ^^^^^^^^^^^^^^^^^^^^^^^^^^
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "/opt/conda/lib/python3.12/site-packages/vllm/multimodal/processing/context.py", line 368, in get_hf_processor
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]     return cached_processor_from_config(
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "/opt/conda/lib/python3.12/site-packages/vllm/transformers_utils/processor.py", line 251, in cached_processor_from_config
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]     return cached_get_processor_without_dynamic_kwargs(
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "/opt/conda/lib/python3.12/site-packages/vllm/transformers_utils/processor.py", line 210, in cached_get_processor_without_dynamic_kwargs
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]     processor = cached_get_processor(
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]                 ^^^^^^^^^^^^^^^^^^^^^
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "/opt/conda/lib/python3.12/site-packages/vllm/transformers_utils/processor.py", line 128, in get_processor
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]     processor = processor_cls.from_pretrained(
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "/opt/conda/lib/python3.12/site-packages/transformers/processing_utils.py", line 1422, in from_pretrained
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]     return cls.from_args_and_dict(args, processor_dict, **instantiation_kwargs)
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "/opt/conda/lib/python3.12/site-packages/transformers/processing_utils.py", line 1189, in from_args_and_dict
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]     processor = cls(*args, **valid_kwargs)
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]                 ^^^^^^^^^^^^^^^^^^^^^^^^^^
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "/opt/conda/lib/python3.12/site-packages/transformers/models/qwen3_vl/processing_qwen3_vl.py", line 60, in __init__
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]     super().__init__(image_processor, tokenizer, video_processor, chat_template=chat_template)
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "/opt/conda/lib/python3.12/site-packages/transformers/processing_utils.py", line 620, in __init__
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]     self.check_argument_for_proper_class(attribute_name, arg)
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "/opt/conda/lib/python3.12/site-packages/transformers/processing_utils.py", line 698, in check_argument_for_proper_class
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]     proper_class = tuple(self.get_possibly_dynamic_module(n) for n in class_name if n is not None)
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "/opt/conda/lib/python3.12/site-packages/transformers/processing_utils.py", line 698, in <genexpr>
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]     proper_class = tuple(self.get_possibly_dynamic_module(n) for n in class_name if n is not None)
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "/opt/conda/lib/python3.12/site-packages/transformers/processing_utils.py", line 1586, in get_possibly_dynamic_module
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]     if hasattr(transformers_module, module_name):
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "/opt/conda/lib/python3.12/site-packages/transformers/utils/import_utils.py", line 2207, in __getattr__
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]     module = self._get_module(self._class_to_module[name])
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "/opt/conda/lib/python3.12/site-packages/transformers/utils/import_utils.py", line 2441, in _get_module
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]     raise e
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "/opt/conda/lib/python3.12/site-packages/transformers/utils/import_utils.py", line 2439, in _get_module
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]     return importlib.import_module("." + module_name, self.__name__)
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "/opt/conda/lib/python3.12/importlib/__init__.py", line 90, in import_module
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]     return _bootstrap._gcd_import(name[level:], package, level)
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "<frozen importlib._bootstrap>", line 1387, in _gcd_import
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "<frozen importlib._bootstrap>", line 1360, in _find_and_load
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "<frozen importlib._bootstrap>", line 1331, in _find_and_load_unlocked
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "<frozen importlib._bootstrap>", line 935, in _load_unlocked
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "<frozen importlib._bootstrap_external>", line 999, in exec_module
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "<frozen importlib._bootstrap>", line 488, in _call_with_frames_removed
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "/opt/conda/lib/python3.12/site-packages/transformers/tokenization_mistral_common.py", line 42, in <module>
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]     from mistral_common.protocol.instruct.request import ChatCompletionRequest, ReasoningEffort
    ERROR 04-07 13:41:34 [multiproc_executor.py:772] ImportError: cannot import name 'ReasoningEffort' from 'mistral_common.protocol.instruct.request' (/opt/conda/lib/python3.12/site-packages/mistral_common/protocol/instruct/request.py)
    INFO 04-07 13:41:34 [multiproc_executor.py:730] Parent process exited, terminating worker
    ERROR 04-07 13:41:34 [multiproc_executor.py:772] WorkerProc failed to start.
    ERROR 04-07 13:41:34 [multiproc_executor.py:772] Traceback (most recent call last):
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/executor/multiproc_executor.py", line 743, in worker_main
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]     worker = WorkerProc(*args, **kwargs)
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/executor/multiproc_executor.py", line 569, in __init__
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]     self.worker.init_device()
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/worker/worker_base.py", line 326, in init_device
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]     self.worker.init_device()  # type: ignore
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]     ^^^^^^^^^^^^^^^^^^^^^^^^^
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/worker/gpu_worker.py", line 262, in init_device
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]     self.model_runner = GPUModelRunnerV1(self.vllm_config, self.device)
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/worker/gpu_model_runner.py", line 647, in __init__
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]     MultiModalBudget(self.vllm_config, self.mm_registry)
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/worker/utils.py", line 45, in __init__
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]     max_tokens_by_modality = mm_registry.get_max_tokens_per_item_by_modality(
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "/opt/conda/lib/python3.12/site-packages/vllm/multimodal/registry.py", line 177, in get_max_tokens_per_item_by_modality
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]     max_tokens_per_item = processor.info.get_mm_max_tokens_per_item(
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "/opt/conda/lib/python3.12/site-packages/vllm/model_executor/models/qwen2_vl.py", line 817, in get_mm_max_tokens_per_item
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]     max_image_tokens = self.get_max_image_tokens()
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "/opt/conda/lib/python3.12/site-packages/vllm/model_executor/models/qwen2_vl.py", line 939, in get_max_image_tokens
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]     target_width, target_height = self.get_image_size_with_most_features()
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "/opt/conda/lib/python3.12/site-packages/vllm/model_executor/models/qwen2_vl.py", line 916, in get_image_size_with_most_features
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]     image_processor = self.get_image_processor()
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]                       ^^^^^^^^^^^^^^^^^^^^^^^^^^
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "/opt/conda/lib/python3.12/site-packages/vllm/model_executor/models/qwen3_vl.py", line 622, in get_image_processor
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]     return self.get_hf_processor(**kwargs).image_processor
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "/opt/conda/lib/python3.12/site-packages/vllm/model_executor/models/qwen3_vl.py", line 615, in get_hf_processor
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]     return self.ctx.get_hf_processor(
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]            ^^^^^^^^^^^^^^^^^^^^^^^^^^
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "/opt/conda/lib/python3.12/site-packages/vllm/multimodal/processing/context.py", line 368, in get_hf_processor
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]     return cached_processor_from_config(
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "/opt/conda/lib/python3.12/site-packages/vllm/transformers_utils/processor.py", line 251, in cached_processor_from_config
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]     return cached_get_processor_without_dynamic_kwargs(
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "/opt/conda/lib/python3.12/site-packages/vllm/transformers_utils/processor.py", line 210, in cached_get_processor_without_dynamic_kwargs
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]     processor = cached_get_processor(
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]                 ^^^^^^^^^^^^^^^^^^^^^
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "/opt/conda/lib/python3.12/site-packages/vllm/transformers_utils/processor.py", line 128, in get_processor
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]     processor = processor_cls.from_pretrained(
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "/opt/conda/lib/python3.12/site-packages/transformers/processing_utils.py", line 1422, in from_pretrained
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]     return cls.from_args_and_dict(args, processor_dict, **instantiation_kwargs)
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "/opt/conda/lib/python3.12/site-packages/transformers/processing_utils.py", line 1189, in from_args_and_dict
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]     processor = cls(*args, **valid_kwargs)
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]                 ^^^^^^^^^^^^^^^^^^^^^^^^^^
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "/opt/conda/lib/python3.12/site-packages/transformers/models/qwen3_vl/processing_qwen3_vl.py", line 60, in __init__
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]     super().__init__(image_processor, tokenizer, video_processor, chat_template=chat_template)
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "/opt/conda/lib/python3.12/site-packages/transformers/processing_utils.py", line 620, in __init__
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]     self.check_argument_for_proper_class(attribute_name, arg)
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "/opt/conda/lib/python3.12/site-packages/transformers/processing_utils.py", line 698, in check_argument_for_proper_class
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]     proper_class = tuple(self.get_possibly_dynamic_module(n) for n in class_name if n is not None)
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "/opt/conda/lib/python3.12/site-packages/transformers/processing_utils.py", line 698, in <genexpr>
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]     proper_class = tuple(self.get_possibly_dynamic_module(n) for n in class_name if n is not None)
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "/opt/conda/lib/python3.12/site-packages/transformers/processing_utils.py", line 1586, in get_possibly_dynamic_module
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]     if hasattr(transformers_module, module_name):
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "/opt/conda/lib/python3.12/site-packages/transformers/utils/import_utils.py", line 2207, in __getattr__
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]     module = self._get_module(self._class_to_module[name])
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "/opt/conda/lib/python3.12/site-packages/transformers/utils/import_utils.py", line 2441, in _get_module
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]     raise e
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "/opt/conda/lib/python3.12/site-packages/transformers/utils/import_utils.py", line 2439, in _get_module
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]     return importlib.import_module("." + module_name, self.__name__)
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "/opt/conda/lib/python3.12/importlib/__init__.py", line 90, in import_module
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]     return _bootstrap._gcd_import(name[level:], package, level)
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "<frozen importlib._bootstrap>", line 1387, in _gcd_import
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "<frozen importlib._bootstrap>", line 1360, in _find_and_load
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "<frozen importlib._bootstrap>", line 1331, in _find_and_load_unlocked
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "<frozen importlib._bootstrap>", line 935, in _load_unlocked
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "<frozen importlib._bootstrap_external>", line 999, in exec_module
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "<frozen importlib._bootstrap>", line 488, in _call_with_frames_removed
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "/opt/conda/lib/python3.12/site-packages/transformers/tokenization_mistral_common.py", line 42, in <module>
    ERROR 04-07 13:41:34 [multiproc_executor.py:772]     from mistral_common.protocol.instruct.request import ChatCompletionRequest, ReasoningEffort
    ERROR 04-07 13:41:34 [multiproc_executor.py:772] ImportError: cannot import name 'ReasoningEffort' from 'mistral_common.protocol.instruct.request' (/opt/conda/lib/python3.12/site-packages/mistral_common/protocol/instruct/request.py)
    [rank0]:[W407 13:41:35.499843956 ProcessGroupNCCL.cpp:1544] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator())
    (EngineCore_DP0 pid=762) ERROR 04-07 13:41:39 [core.py:946] EngineCore failed to start.
    (EngineCore_DP0 pid=762) ERROR 04-07 13:41:39 [core.py:946] Traceback (most recent call last):
    (EngineCore_DP0 pid=762) ERROR 04-07 13:41:39 [core.py:946]   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/engine/core.py", line 937, in run_engine_core
    (EngineCore_DP0 pid=762) ERROR 04-07 13:41:39 [core.py:946]     engine_core = EngineCoreProc(*args, engine_index=dp_rank, **kwargs)
    (EngineCore_DP0 pid=762) ERROR 04-07 13:41:39 [core.py:946]                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    (EngineCore_DP0 pid=762) ERROR 04-07 13:41:39 [core.py:946]   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/engine/core.py", line 691, in __init__
    (EngineCore_DP0 pid=762) ERROR 04-07 13:41:39 [core.py:946]     super().__init__(
    (EngineCore_DP0 pid=762) ERROR 04-07 13:41:39 [core.py:946]   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/engine/core.py", line 105, in __init__
    (EngineCore_DP0 pid=762) ERROR 04-07 13:41:39 [core.py:946]     self.model_executor = executor_class(vllm_config)
    (EngineCore_DP0 pid=762) ERROR 04-07 13:41:39 [core.py:946]                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
    (EngineCore_DP0 pid=762) ERROR 04-07 13:41:39 [core.py:946]   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/executor/multiproc_executor.py", line 97, in __init__
    (EngineCore_DP0 pid=762) ERROR 04-07 13:41:39 [core.py:946]     super().__init__(vllm_config)
    (EngineCore_DP0 pid=762) ERROR 04-07 13:41:39 [core.py:946]   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/executor/abstract.py", line 101, in __init__
    (EngineCore_DP0 pid=762) ERROR 04-07 13:41:39 [core.py:946]     self._init_executor()
    (EngineCore_DP0 pid=762) ERROR 04-07 13:41:39 [core.py:946]   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/executor/multiproc_executor.py", line 165, in _init_executor
    (EngineCore_DP0 pid=762) ERROR 04-07 13:41:39 [core.py:946]     self.workers = WorkerProc.wait_for_ready(unready_workers)
    (EngineCore_DP0 pid=762) ERROR 04-07 13:41:39 [core.py:946]                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    (EngineCore_DP0 pid=762) ERROR 04-07 13:41:39 [core.py:946]   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/executor/multiproc_executor.py", line 678, in wait_for_ready
    (EngineCore_DP0 pid=762) ERROR 04-07 13:41:39 [core.py:946]     raise e from None
    (EngineCore_DP0 pid=762) ERROR 04-07 13:41:39 [core.py:946] Exception: WorkerProc initialization failed due to an exception in a background process. See stack trace for root cause.
    (EngineCore_DP0 pid=762) Process EngineCore_DP0:
    (EngineCore_DP0 pid=762) Traceback (most recent call last):
    (EngineCore_DP0 pid=762)   File "/opt/conda/lib/python3.12/multiprocessing/process.py", line 314, in _bootstrap
    (EngineCore_DP0 pid=762)     self.run()
    (EngineCore_DP0 pid=762)   File "/opt/conda/lib/python3.12/multiprocessing/process.py", line 108, in run
    (EngineCore_DP0 pid=762)     self._target(*self._args, **self._kwargs)
    (EngineCore_DP0 pid=762)   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/engine/core.py", line 950, in run_engine_core
    (EngineCore_DP0 pid=762)     raise e
    (EngineCore_DP0 pid=762)   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/engine/core.py", line 937, in run_engine_core
    (EngineCore_DP0 pid=762)     engine_core = EngineCoreProc(*args, engine_index=dp_rank, **kwargs)
    (EngineCore_DP0 pid=762)                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    (EngineCore_DP0 pid=762)   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/engine/core.py", line 691, in __init__
    (EngineCore_DP0 pid=762)     super().__init__(
    (EngineCore_DP0 pid=762)   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/engine/core.py", line 105, in __init__
    (EngineCore_DP0 pid=762)     self.model_executor = executor_class(vllm_config)
    (EngineCore_DP0 pid=762)                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
    (EngineCore_DP0 pid=762)   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/executor/multiproc_executor.py", line 97, in __init__
    (EngineCore_DP0 pid=762)     super().__init__(vllm_config)
    (EngineCore_DP0 pid=762)   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/executor/abstract.py", line 101, in __init__
    (EngineCore_DP0 pid=762)     self._init_executor()
    (EngineCore_DP0 pid=762)   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/executor/multiproc_executor.py", line 165, in _init_executor
    (EngineCore_DP0 pid=762)     self.workers = WorkerProc.wait_for_ready(unready_workers)
    (EngineCore_DP0 pid=762)                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    (EngineCore_DP0 pid=762)   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/executor/multiproc_executor.py", line 678, in wait_for_ready
    (EngineCore_DP0 pid=762)     raise e from None
    (EngineCore_DP0 pid=762) Exception: WorkerProc initialization failed due to an exception in a background process. See stack trace for root cause.
    (APIServer pid=352) Traceback (most recent call last):
    (APIServer pid=352)   File "/opt/conda/bin/vllm", line 8, in <module>
    (APIServer pid=352)     sys.exit(main())
    (APIServer pid=352)              ^^^^^^
    (APIServer pid=352)   File "/opt/conda/lib/python3.12/site-packages/vllm/entrypoints/cli/main.py", line 73, in main
    (APIServer pid=352)     args.dispatch_function(args)
    (APIServer pid=352)   File "/opt/conda/lib/python3.12/site-packages/vllm/entrypoints/cli/serve.py", line 111, in cmd
    (APIServer pid=352)     uvloop.run(run_server(args))
    (APIServer pid=352)   File "/opt/conda/lib/python3.12/site-packages/uvloop/__init__.py", line 96, in run
    (APIServer pid=352)     return __asyncio.run(
    (APIServer pid=352)            ^^^^^^^^^^^^^^
    (APIServer pid=352)   File "/opt/conda/lib/python3.12/asyncio/runners.py", line 195, in run
    (APIServer pid=352)     return runner.run(main)
    (APIServer pid=352)            ^^^^^^^^^^^^^^^^
    (APIServer pid=352)   File "/opt/conda/lib/python3.12/asyncio/runners.py", line 118, in run
    (APIServer pid=352)     return self._loop.run_until_complete(task)
    (APIServer pid=352)            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    (APIServer pid=352)   File "uvloop/loop.pyx", line 1518, in uvloop.loop.Loop.run_until_complete
    (APIServer pid=352)   File "/opt/conda/lib/python3.12/site-packages/uvloop/__init__.py", line 48, in wrapper
    (APIServer pid=352)     return await main
    (APIServer pid=352)            ^^^^^^^^^^
    (APIServer pid=352)   File "/opt/conda/lib/python3.12/site-packages/vllm/entrypoints/openai/api_server.py", line 919, in run_server
    (APIServer pid=352)     await run_server_worker(listen_address, sock, args, **uvicorn_kwargs)
    (APIServer pid=352)   File "/opt/conda/lib/python3.12/site-packages/vllm/entrypoints/openai/api_server.py", line 938, in run_server_worker
    (APIServer pid=352)     async with build_async_engine_client(
    (APIServer pid=352)                ^^^^^^^^^^^^^^^^^^^^^^^^^^
    (APIServer pid=352)   File "/opt/conda/lib/python3.12/contextlib.py", line 210, in __aenter__
    (APIServer pid=352)     return await anext(self.gen)
    (APIServer pid=352)            ^^^^^^^^^^^^^^^^^^^^^
    (APIServer pid=352)   File "/opt/conda/lib/python3.12/site-packages/vllm/entrypoints/openai/api_server.py", line 147, in build_async_engine_client
    (APIServer pid=352)     async with build_async_engine_client_from_engine_args(
    (APIServer pid=352)                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    (APIServer pid=352)   File "/opt/conda/lib/python3.12/contextlib.py", line 210, in __aenter__
    (APIServer pid=352)     return await anext(self.gen)
    (APIServer pid=352)            ^^^^^^^^^^^^^^^^^^^^^
    (APIServer pid=352)   File "/opt/conda/lib/python3.12/site-packages/vllm/entrypoints/openai/api_server.py", line 188, in build_async_engine_client_from_engine_args
    (APIServer pid=352)     async_llm = AsyncLLM.from_vllm_config(
    (APIServer pid=352)                 ^^^^^^^^^^^^^^^^^^^^^^^^^^
    (APIServer pid=352)   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/engine/async_llm.py", line 228, in from_vllm_config
    (APIServer pid=352)     return cls(
    (APIServer pid=352)            ^^^^
    (APIServer pid=352)   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/engine/async_llm.py", line 155, in __init__
    (APIServer pid=352)     self.engine_core = EngineCoreClient.make_async_mp_client(
    (APIServer pid=352)                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    (APIServer pid=352)   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/engine/core_client.py", line 122, in make_async_mp_client
    (APIServer pid=352)     return AsyncMPClient(*client_args)
    (APIServer pid=352)            ^^^^^^^^^^^^^^^^^^^^^^^^^^^
    (APIServer pid=352)   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/engine/core_client.py", line 819, in __init__
    (APIServer pid=352)     super().__init__(
    (APIServer pid=352)   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/engine/core_client.py", line 479, in __init__
    (APIServer pid=352)     with launch_core_engines(vllm_config, executor_class, log_stats) as (
    (APIServer pid=352)          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    (APIServer pid=352)   File "/opt/conda/lib/python3.12/contextlib.py", line 144, in __exit__
    (APIServer pid=352)     next(self.gen)
    (APIServer pid=352)   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/engine/utils.py", line 933, in launch_core_engines
    (APIServer pid=352)     wait_for_engine_startup(
    (APIServer pid=352)   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/engine/utils.py", line 992, in wait_for_engine_startup
    (APIServer pid=352)     raise RuntimeError(
    (APIServer pid=352) RuntimeError: Engine core initialization failed. See root cause above. Failed core proc(s): {}
    /opt/conda/lib/python3.12/multiprocessing/resource_tracker.py:279: UserWarning: resource_tracker: There appear to be 1 leaked shared_memory objects to clean up at shutdown
      warnings.warn('resource_tracker: There appear to be %d '
    ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
    vllm 0.15.0 requires flashinfer-python==0.6.1, which is not installed.
    vllm 0.15.0 requires opencv-python-headless>=4.13.0, but you have opencv-python-headless 4.11.0.86 which is incompatible.
    vllm 0.15.0 requires torch==2.9.1, but you have torch 2.8.0+metax3.5.3.9 which is incompatible.
    vllm 0.15.0 requires torchaudio==2.9.1, but you have torchaudio 2.4.1+metax3.5.3.9 which is incompatible.
    vllm 0.15.0 requires torchvision==0.24.1, but you have torchvision 0.15.1+metax3.5.3.9 which is incompatible.
    vllm 0.15.0 requires transformers<5,>=4.56.0, but you have transformers 5.5.0 which is incompatible.
    vllm-metax 0.15.0+g24fb31.d20260310.maca3.5.3.20.torch2.8 requires transformers<5,>=4.56.0, but you have transformers 5.5.0 which is incompatible.
    Successfully installed hf-xet-1.4.3 huggingface-hub-1.9.0 transformers-5.5.0
    WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager, possibly rendering your system unusable. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv. Use the --root-user-action option if you know what you are doing and want to suppress this warning.
    

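    The pip output above points at the root cause: both vllm 0.15.0 and vllm-metax pin `transformers<5,>=4.56.0`, so installing transformers 5.5.0 is guaranteed to conflict. As a minimal stdlib sketch (the helper names here are illustrative, not part of vllm or pip), this is how such a specifier rejects the upgraded version:

    ```python
    import re

    def parse_version(v: str) -> tuple:
        # '4.56.0' -> (4, 56, 0); int-tuple comparison covers plain
        # numeric releases (a toy subset of PEP 440, enough for this case)
        return tuple(int(p) for p in re.findall(r"\d+", v))

    OPS = {
        "<":  lambda a, b: a < b,
        "<=": lambda a, b: a <= b,
        ">":  lambda a, b: a > b,
        ">=": lambda a, b: a >= b,
        "==": lambda a, b: a == b,
    }

    def satisfies(installed: str, specs: str) -> bool:
        """Check an installed version against specs like '<5,>=4.56.0',
        the exact requirement string vllm declares for transformers."""
        for spec in specs.split(","):
            m = re.match(r"\s*(<=|>=|==|<|>)\s*([\w.]+)", spec)
            op, ver = m.group(1), m.group(2)
            if not OPS[op](parse_version(installed), parse_version(ver)):
                return False
        return True

    # transformers 5.5.0 violates vllm's 'transformers<5,>=4.56.0' pin:
    print(satisfies("5.5.0", "<5,>=4.56.0"))   # False
    print(satisfies("4.57.1", "<5,>=4.56.0"))  # True
    ```

    In other words, any transformers 5.x wheel will break this vllm build's declared dependencies, which matches the `ImportError` and engine-startup failure in the log.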
    It still fails to start. The full log is attached.

  • See post
    OverS9982
    Members
    Qwen3.5-397B-A17B-W8A8 runtime error [Resolved] 2026-04-07 13:28

    Docker image: vllm-metax:0.15.0-maca.ai3.5.3.203-torch2.8-py312-ubuntu22.04-amd64
    mx-smi:

    mx-smi  version: 2.2.12
    
    =================== MetaX System Management Interface Log ===================
    Timestamp                                         : Tue Apr  7 13:23:44 2026
    
    Attached GPUs                                     : 8
    +---------------------------------------------------------------------------------+
    | MX-SMI 2.2.12                      Kernel Mode Driver Version: 3.3.12           |
    | MACA Version: 3.5.3.20             BIOS Version: 1.22.3.0                       |
    |------------------+-----------------+---------------------+----------------------|
    | Board       Name | GPU   Persist-M | Bus-id              | GPU-Util      sGPU-M |
    | Pwr:Usage/Cap    | Temp       Perf | Memory-Usage        | GPU-State            |
    |==================+=================+=====================+======================|
    | 0     MetaX C550 | 0           N/A | 0000:2a:00.0        | 0%          Disabled |
    | NA / NA          | 34C         N/A | 858/65536 MiB       | Available            |
    +------------------+-----------------+---------------------+----------------------+
    | 1     MetaX C550 | 1           N/A | 0000:3a:00.0        | 0%          Disabled |
    | NA / NA          | 38C         N/A | 858/65536 MiB       | Available            |
    +------------------+-----------------+---------------------+----------------------+
    | 2     MetaX C550 | 2           N/A | 0000:4c:00.0        | 0%          Disabled |
    | NA / NA          | 39C         N/A | 858/65536 MiB       | Available            |
    +------------------+-----------------+---------------------+----------------------+
    | 3     MetaX C550 | 3           N/A | 0000:5c:00.0        | 0%          Disabled |
    | NA / NA          | 37C         N/A | 858/65536 MiB       | Available            |
    +------------------+-----------------+---------------------+----------------------+
    | 4     MetaX C550 | 4           N/A | 0000:aa:00.0        | 0%          Disabled |
    | NA / NA          | 37C         N/A | 858/65536 MiB       | Available            |
    +------------------+-----------------+---------------------+----------------------+
    | 5     MetaX C550 | 5           N/A | 0000:ba:00.0        | 0%          Disabled |
    | NA / NA          | 39C         N/A | 858/65536 MiB       | Available            |
    +------------------+-----------------+---------------------+----------------------+
    | 6     MetaX C550 | 6           N/A | 0000:ca:00.0        | 0%          Disabled |
    | NA / NA          | 39C         N/A | 858/65536 MiB       | Available            |
    +------------------+-----------------+---------------------+----------------------+
    | 7     MetaX C550 | 7           N/A | 0000:da:00.0        | 0%          Disabled |
    | NA / NA          | 35C         N/A | 858/65536 MiB       | Available            |
    +------------------+-----------------+---------------------+----------------------+
    
    +---------------------------------------------------------------------------------+
    | Process:                                                                        |
    |  GPU                    PID         Process Name                 GPU Memory     |
    |                                                                  Usage(MiB)     |
    |=================================================================================|
    |  no process found                                                               |
    +---------------------------------------------------------------------------------+
    
    End of Log
    

    vllm launch command:

    nohup vllm serve /data/metax-tech/Qwen3.5-397B-A17B-W8A8 \
      --host 0.0.0.0 \
      --port 8000 \
      --tensor-parallel-size 8 \
      --gpu-memory-utilization 0.9 \
      --max-model-len 262144 \
      --reasoning-parser qwen3 \
      --enable-auto-tool-choice \
      --tool-call-parser qwen3_coder \
      --served-model-name Qwen3.5-W8A8 \
      --speculative-config '{"method":"qwen3_next_mtp","num_speculative_tokens":2}' &
    

    Error output:

    root@h20gpu-20:/data/metax-tech/Qwen3.5-397B-A17B-W8A8# tail -f nohup.out
    (APIServer pid=32)   File "/opt/conda/lib/python3.12/site-packages/vllm/engine/arg_utils.py", line 12
    (APIServer pid=32)     return ModelConfig(
    (APIServer pid=32)            ^^^^^^^^^^^^
    (APIServer pid=32)   File "/opt/conda/lib/python3.12/site-packages/pydantic/_internal/_dataclasses.py
    (APIServer pid=32)     s.__pydantic_validator__.validate_python(ArgsKwargs(args, kwargs), self_instan
    (APIServer pid=32) pydantic_core._pydantic_core.ValidationError: 1 validation error for ModelConfig
    (APIServer pid=32)   Value error, The checkpoint you are trying to load has model type `qwen3_5_moe` but Transformers does not recognize this architecture. This could be because of an issue with the checkpoint, or because your version of Transformers is out of date.
    (APIServer pid=32)
    (APIServer pid=32) You can update Transformers with the command `pip install --upgrade transformers`. If this does not work, and the checkpoint is very new, then there may not be a release version that supports this model yet. In this case, you can get the most up-to-date code by installing Transformers from source. [type=value_error, input_value=ArgsKwargs((), {'model': ...rocessor_plugin': None}), input_type=ArgsKwargs]
    (APIServer pid=32)     For further information visit https://errors.pydantic.dev/2.12/v/value_error
    

    Do I need to upgrade transformers?
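    For reference, the `model_type` that vllm/Transformers complains about is read straight from the checkpoint's `config.json`, so you can confirm what the weights declare before deciding which Transformers version to install. A small sketch (the function name and the demo directory are mine; substitute the real checkpoint path from this thread):

    ```python
    import json
    from pathlib import Path

    def declared_model_type(checkpoint_dir: str) -> str:
        """Return the architecture id Transformers dispatches on,
        taken from <checkpoint_dir>/config.json."""
        cfg = json.loads((Path(checkpoint_dir) / "config.json").read_text())
        return cfg.get("model_type", "<missing>")

    # Stand-in checkpoint written locally for the demo; pointing this at
    # /data/metax-tech/Qwen3.5-397B-A17B-W8A8 should print the same value
    # the error message quotes.
    demo = Path("demo_ckpt")
    demo.mkdir(exist_ok=True)
    (demo / "config.json").write_text(json.dumps({"model_type": "qwen3_5_moe"}))
    print(declared_model_type("demo_ckpt"))  # qwen3_5_moe
    ```

    If the installed Transformers has no config class registered for that `model_type`, you get exactly the `ValueError` above, regardless of which serving engine (vllm or sglang) sits on top.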

  • MetaX Developer Forum (沐曦开发者论坛)
powered by misago