WARNING 04-07 13:40:17 [registry.py:812] Model architecture DeepseekV3ForCausalLM is already registered, and will be overwritten by the new model class vllm_metax.models.deepseek_v2:DeepseekV3ForCausalLM.
WARNING 04-07 13:40:17 [registry.py:812] Model architecture DeepseekV32ForCausalLM is already registered, and will be overwritten by the new model class vllm_metax.models.deepseek_v2:DeepseekV3ForCausalLM.
WARNING 04-07 13:40:17 [registry.py:812] Model architecture KimiK25ForConditionalGeneration is already registered, and will be overwritten by the new model class vllm_metax.models.kimi_k25:KimiK25ForConditionalGeneration.
(APIServer pid=352) INFO 04-07 13:40:17 [utils.py:325]
(APIServer pid=352) INFO 04-07 13:40:17 [utils.py:325] █ █ █▄ ▄█
(APIServer pid=352) INFO 04-07 13:40:17 [utils.py:325] ▄▄ ▄█ █ █ █ ▀▄▀ █    version 0.15.0
(APIServer pid=352) INFO 04-07 13:40:17 [utils.py:325] █▄█▀ █ █ █ █         model /data/metax-tech/Qwen3.5-397B-A17B-W8A8
(APIServer pid=352) INFO 04-07 13:40:17 [utils.py:325] ▀▀ ▀▀▀▀▀ ▀▀▀▀▀ ▀ ▀
(APIServer pid=352) INFO 04-07 13:40:17 [utils.py:325]
(APIServer pid=352) INFO 04-07 13:40:17 [utils.py:261] non-default args: {'model_tag': '/data/metax-tech/Qwen3.5-397B-A17B-W8A8', 'api_server_count': 1, 'host': '0.0.0.0', 'enable_auto_tool_choice': True, 'tool_call_parser': 'qwen3_coder', 'model': '/data/metax-tech/Qwen3.5-397B-A17B-W8A8', 'max_model_len': 262144, 'served_model_name': ['Qwen3.5-W8A8'], 'reasoning_parser': 'qwen3', 'tensor_parallel_size': 8, 'speculative_config': {'method': 'qwen3_next_mtp', 'num_speculative_tokens': 2}}
(APIServer pid=352) INFO 04-07 13:40:36 [model.py:541] Resolved architecture: Qwen3_5MoeForConditionalGeneration
(APIServer pid=352) INFO 04-07 13:40:36 [model.py:1561] Using max model len 262144
(APIServer pid=352) WARNING 04-07 13:40:36 [speculative.py:270] method `qwen3_next_mtp` is deprecated and replaced with mtp.
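The non-default args logged above map onto a `vllm serve` invocation along these lines. This is a hedged reconstruction from the logged dictionary, not the exact command that was run; flag spellings follow the standard vLLM CLI:

```shell
vllm serve /data/metax-tech/Qwen3.5-397B-A17B-W8A8 \
  --host 0.0.0.0 \
  --served-model-name Qwen3.5-W8A8 \
  --max-model-len 262144 \
  --tensor-parallel-size 8 \
  --enable-auto-tool-choice \
  --tool-call-parser qwen3_coder \
  --reasoning-parser qwen3 \
  --speculative-config '{"method": "qwen3_next_mtp", "num_speculative_tokens": 2}'
```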
(APIServer pid=352) INFO 04-07 13:40:55 [model.py:541] Resolved architecture: Qwen3_5MoeMTP
(APIServer pid=352) INFO 04-07 13:40:55 [model.py:1561] Using max model len 262144
(APIServer pid=352) WARNING 04-07 13:40:55 [speculative.py:388] Enabling num_speculative_tokens > 1 will run the forward pass on the same MTP layer multiple times, which may result in a lower acceptance rate
(APIServer pid=352) INFO 04-07 13:40:55 [scheduler.py:226] Chunked prefill is enabled with max_num_batched_tokens=2048.
(APIServer pid=352) `Qwen2VLImageProcessorFast` is deprecated. The `Fast` suffix for image processors has been removed; use `Qwen2VLImageProcessor` instead.
(APIServer pid=352) [2026-04-07 13:40:55] INFO config.py:139: Setting attention block size to 512 tokens to ensure that attention page size is >= mamba page size.
(APIServer pid=352) [2026-04-07 13:40:55] INFO config.py:170: Padding mamba page size by 88.93% to ensure that mamba page size and attention page size are exactly equal.
(APIServer pid=352) /opt/conda/lib/python3.12/site-packages/compressed_tensors/quantization/quant_args.py:362: UserWarning: No observer is used for dynamic quant., setting to None
(APIServer pid=352)   warnings.warn(
(APIServer pid=352) INFO 04-07 13:40:55 [vllm.py:624] Asynchronous scheduling is enabled.
(APIServer pid=352) INFO 04-07 13:40:55 [envs.py:83] Plugin sets VLLM_TUNED_CONFIG_FOLDER to /opt/conda/lib/python3.12/site-packages/vllm_metax/model_executor/layers/fused_moe/configs/H=4096. Reason: set FusedMoE tuned config dir by hidden_size=4096
INFO 04-07 13:41:01 [__init__.py:43] Available plugins for group vllm.platform_plugins:
INFO 04-07 13:41:01 [__init__.py:45] - metax -> vllm_metax:register
INFO 04-07 13:41:01 [__init__.py:48] All plugins in this group will be loaded. Set `VLLM_PLUGINS` to control which plugins to load.
INFO 04-07 13:41:01 [__init__.py:217] Platform plugin metax is activated
INFO 04-07 13:41:01 [envs.py:83] Plugin sets VLLM_USE_FLASHINFER_SAMPLER to False. Reason: flashinfer sampler is not supported on maca
INFO 04-07 13:41:01 [envs.py:83] Plugin sets VLLM_ENGINE_READY_TIMEOUT_S to 3600. Reason: set timeout to 3600s for model loading
INFO Print the version information of mcoplib during compilation. Version info: Mcoplib_Version = '0.4.0' Build_Maca_Version = '3.5.3.18' GIT_BRANCH = 'HEAD' GIT_COMMIT = 'fe3a7e2' Vllm Op Version = 0.15.0 SGlang Op Version = 0.5.7 && 0.5.8
INFO Starting check of the current MACA version of the operating environment.
INFO: Release major.minor matching, successful: 3.5.
WARNING 04-07 13:41:05 [__init__.py:86] The quantization method 'awq' already exists and will be overwritten by the quantization config.
WARNING 04-07 13:41:05 [__init__.py:86] The quantization method 'awq_marlin' already exists and will be overwritten by the quantization config.
WARNING 04-07 13:41:05 [__init__.py:86] The quantization method 'compressed-tensors' already exists and will be overwritten by the quantization config.
WARNING 04-07 13:41:05 [__init__.py:86] The quantization method 'gptq' already exists and will be overwritten by the quantization config.
WARNING 04-07 13:41:05 [__init__.py:86] The quantization method 'gptq_marlin' already exists and will be overwritten by the quantization config.
WARNING 04-07 13:41:05 [__init__.py:86] The quantization method 'moe_wna16' already exists and will be overwritten by the quantization config.
(EngineCore_DP0 pid=762) WARNING 04-07 13:41:09 [registry.py:812] Model architecture DeepSeekMTPModel is already registered, and will be overwritten by the new model class vllm_metax.models.deepseek_mtp:DeepSeekMTP.
(EngineCore_DP0 pid=762) WARNING 04-07 13:41:09 [registry.py:812] Model architecture DeepseekV2ForCausalLM is already registered, and will be overwritten by the new model class vllm_metax.models.deepseek_v2:DeepseekV2ForCausalLM.
(EngineCore_DP0 pid=762) WARNING 04-07 13:41:09 [registry.py:812] Model architecture DeepseekV3ForCausalLM is already registered, and will be overwritten by the new model class vllm_metax.models.deepseek_v2:DeepseekV3ForCausalLM.
(EngineCore_DP0 pid=762) WARNING 04-07 13:41:09 [registry.py:812] Model architecture DeepseekV32ForCausalLM is already registered, and will be overwritten by the new model class vllm_metax.models.deepseek_v2:DeepseekV3ForCausalLM.
(EngineCore_DP0 pid=762) WARNING 04-07 13:41:09 [registry.py:812] Model architecture KimiK25ForConditionalGeneration is already registered, and will be overwritten by the new model class vllm_metax.models.kimi_k25:KimiK25ForConditionalGeneration.
(EngineCore_DP0 pid=762) INFO 04-07 13:41:09 [core.py:96] Initializing a V1 LLM engine (v0.15.0) with config: model='/data/metax-tech/Qwen3.5-397B-A17B-W8A8', speculative_config=SpeculativeConfig(method='mtp', model='/data/metax-tech/Qwen3.5-397B-A17B-W8A8', num_spec_tokens=2), tokenizer='/data/metax-tech/Qwen3.5-397B-A17B-W8A8', skip_tokenizer_init=False, tokenizer_mode=auto, revision=None, tokenizer_revision=None, trust_remote_code=False, dtype=torch.bfloat16, max_seq_len=262144, download_dir=None, load_format=auto, tensor_parallel_size=8, pipeline_parallel_size=1, data_parallel_size=1, disable_custom_all_reduce=True, quantization=compressed-tensors, enforce_eager=False, enable_return_routed_experts=False, kv_cache_dtype=auto, device_config=cuda, structured_outputs_config=StructuredOutputsConfig(backend='auto', disable_fallback=False, disable_any_whitespace=False, disable_additional_properties=False, reasoning_parser='qwen3', reasoning_parser_plugin='', enable_in_reasoning=False), observability_config=ObservabilityConfig(show_hidden_metrics_for_version=None, otlp_traces_endpoint=None, collect_detailed_traces=None, kv_cache_metrics=False, kv_cache_metrics_sample=0.01, cudagraph_metrics=False, enable_layerwise_nvtx_tracing=False, enable_mfu_metrics=False, enable_mm_processor_stats=False, enable_logging_iteration_details=False), seed=0, served_model_name=Qwen3.5-W8A8, enable_prefix_caching=False, enable_chunked_prefill=True, pooler_config=None, compilation_config={'level': None, 'mode': , 'debug_dump_path': None, 'cache_dir': '', 'compile_cache_save_format': 'binary', 'backend': 'inductor', 'custom_ops': ['none'], 'splitting_ops': ['vllm::unified_attention', 'vllm::unified_attention_with_output', 'vllm::unified_mla_attention', 'vllm::unified_mla_attention_with_output', 'vllm::mamba_mixer2', 'vllm::mamba_mixer', 'vllm::short_conv', 'vllm::linear_attention', 'vllm::plamo2_mamba_mixer', 'vllm::gdn_attention_core', 'vllm::kda_attention', 'vllm::sparse_attn_indexer', 'vllm::rocm_aiter_sparse_attn_indexer', 'vllm::mx_sparse_attn_indexer'], 'compile_mm_encoder': False, 'compile_sizes': [], 'compile_ranges_split_points': [2048], 'inductor_compile_config': {'enable_auto_functionalized_v2': False}, 'inductor_passes': {}, 'cudagraph_mode': , 'cudagraph_num_of_warmups': 1, 'cudagraph_capture_sizes': [1, 2, 4, 8, 16, 24, 32, 40, 48, 56, 64, 72, 80, 88, 96, 104, 112, 120, 128, 136, 144, 152, 160, 168, 176, 184, 192, 200, 208, 216, 224, 232, 240, 248, 256, 272, 288, 304, 320, 336, 352, 368, 384, 400, 416, 432, 448, 464, 480, 496, 512], 'cudagraph_copy_inputs': False, 'cudagraph_specialize_lora': True, 'use_inductor_graph_partition': False, 'pass_config': {'fuse_norm_quant': False, 'fuse_act_quant': False, 'fuse_attn_quant': False, 'eliminate_noops': True, 'enable_sp': False, 'fuse_gemm_comms': False, 'fuse_allreduce_rms': False}, 'max_cudagraph_capture_size': 512, 'dynamic_shapes_config': {'type': , 'evaluate_guards': False, 'assume_32_bit_indexing': True}, 'local_cache_dir': None}
(EngineCore_DP0 pid=762) WARNING 04-07 13:41:09 [multiproc_executor.py:910] Reducing Torch parallelism from 96 threads to 1 to avoid unnecessary CPU contention. Set OMP_NUM_THREADS in the external environment to tune this value as needed.
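Two environment variables surface in these log entries: `VLLM_PLUGINS`, which selects which platform plugins load, and `OMP_NUM_THREADS`, which the multiproc executor warning suggests tuning externally. A minimal sketch (the value 8 is illustrative, not taken from this log):

```shell
# Load only the metax platform plugin; other installed vLLM plugins are skipped
export VLLM_PLUGINS=metax

# Override the executor's Torch-parallelism reduction; tune for your host
export OMP_NUM_THREADS=8
```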
INFO 04-07 13:41:11 [__init__.py:43] Available plugins for group vllm.platform_plugins:
INFO 04-07 13:41:11 [__init__.py:45] - metax -> vllm_metax:register
INFO 04-07 13:41:11 [__init__.py:48] All plugins in this group will be loaded. Set `VLLM_PLUGINS` to control which plugins to load.
INFO 04-07 13:41:11 [__init__.py:217] Platform plugin metax is activated
INFO 04-07 13:41:11 [envs.py:83] Plugin sets VLLM_USE_FLASHINFER_SAMPLER to False. Reason: flashinfer sampler is not supported on maca
INFO 04-07 13:41:11 [envs.py:83] Plugin sets VLLM_ENGINE_READY_TIMEOUT_S to 3600. Reason: set timeout to 3600s for model loading
INFO 04-07 13:41:22 [parallel_state.py:1212] world_size=8 rank=6 local_rank=6 distributed_init_method=tcp://127.0.0.1:38003 backend=nccl
INFO 04-07 13:41:22 [parallel_state.py:1212] world_size=8 rank=2 local_rank=2 distributed_init_method=tcp://127.0.0.1:38003 backend=nccl
INFO 04-07 13:41:22 [parallel_state.py:1212] world_size=8 rank=0 local_rank=0 distributed_init_method=tcp://127.0.0.1:38003 backend=nccl
INFO 04-07 13:41:22 [parallel_state.py:1212] world_size=8 rank=7 local_rank=7 distributed_init_method=tcp://127.0.0.1:38003 backend=nccl
INFO 04-07 13:41:22 [parallel_state.py:1212] world_size=8 rank=3 local_rank=3 distributed_init_method=tcp://127.0.0.1:38003 backend=nccl
INFO 04-07 13:41:22 [parallel_state.py:1212] world_size=8 rank=5 local_rank=5 distributed_init_method=tcp://127.0.0.1:38003 backend=nccl
INFO 04-07 13:41:22 [parallel_state.py:1212] world_size=8 rank=1 local_rank=1 distributed_init_method=tcp://127.0.0.1:38003 backend=nccl
INFO 04-07 13:41:22 [parallel_state.py:1212] world_size=8 rank=4 local_rank=4 distributed_init_method=tcp://127.0.0.1:38003 backend=nccl
INFO 04-07 13:41:23 [mccl.py:26] Found mccl from library libmccl.so
INFO 04-07 13:41:23 [pynccl.py:111] vLLM is using nccl==2.16.5
INFO 04-07 13:41:32 [parallel_state.py:1423] rank 7 in world size 8 is assigned as DP rank 0, PP rank 0, PCP rank 0, TP rank 7, EP rank 7
INFO 04-07 13:41:32 [parallel_state.py:1423] rank 5 in world size 8 is assigned as DP rank 0, PP rank 0, PCP rank 0, TP rank 5, EP rank 5
INFO 04-07 13:41:32 [parallel_state.py:1423] rank 4 in world size 8 is assigned as DP rank 0, PP rank 0, PCP rank 0, TP rank 4, EP rank 4
INFO 04-07 13:41:32 [parallel_state.py:1423] rank 6 in world size 8 is assigned as DP rank 0, PP rank 0, PCP rank 0, TP rank 6, EP rank 6
INFO 04-07 13:41:32 [parallel_state.py:1423] rank 0 in world size 8 is assigned as DP rank 0, PP rank 0, PCP rank 0, TP rank 0, EP rank 0
INFO 04-07 13:41:32 [parallel_state.py:1423] rank 2 in world size 8 is assigned as DP rank 0, PP rank 0, PCP rank 0, TP rank 2, EP rank 2
INFO 04-07 13:41:32 [parallel_state.py:1423] rank 3 in world size 8 is assigned as DP rank 0, PP rank 0, PCP rank 0, TP rank 3, EP rank 3
INFO 04-07 13:41:32 [parallel_state.py:1423] rank 1 in world size 8 is assigned as DP rank 0, PP rank 0, PCP rank 0, TP rank 1, EP rank 1
WARNING 04-07 13:41:32 [__init__.py:204] min_p, logit_bias, and min_tokens parameters won't currently work with speculative decoding enabled.
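The rank assignments above are the degenerate case of tensor_parallel_size=8 with data, pipeline, and context parallel sizes all equal to 1, so each global rank maps directly onto its own TP (and EP) rank. A toy decomposition under exactly those assumptions; `assign_ranks` is an illustrative helper, not vLLM's actual code:

```python
def assign_ranks(rank: int, world_size: int, tp_size: int) -> dict:
    """Decompose a global rank into DP/PP/TP/EP ranks, assuming PP = PCP = 1
    and EP sharing the TP group (as in the log above)."""
    assert world_size % tp_size == 0
    dp_rank = rank // tp_size  # which data-parallel replica this rank belongs to
    tp_rank = rank % tp_size   # position within the tensor-parallel group
    return {"dp": dp_rank, "pp": 0, "tp": tp_rank, "ep": tp_rank}

# With world_size=8 and tp_size=8, rank 7 lands in DP group 0 as TP/EP rank 7,
# matching the "rank 7 ... TP rank 7, EP rank 7" record above.
```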
ERROR 04-07 13:41:34 [multiproc_executor.py:772] WorkerProc failed to start.
ERROR 04-07 13:41:34 [multiproc_executor.py:772] Traceback (most recent call last):
ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/executor/multiproc_executor.py", line 743, in worker_main
ERROR 04-07 13:41:34 [multiproc_executor.py:772]     worker = WorkerProc(*args, **kwargs)
ERROR 04-07 13:41:34 [multiproc_executor.py:772]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/executor/multiproc_executor.py", line 569, in __init__
ERROR 04-07 13:41:34 [multiproc_executor.py:772]     self.worker.init_device()
ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/worker/worker_base.py", line 326, in init_device
ERROR 04-07 13:41:34 [multiproc_executor.py:772]     self.worker.init_device()  # type: ignore
ERROR 04-07 13:41:34 [multiproc_executor.py:772]     ^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/worker/gpu_worker.py", line 262, in init_device
ERROR 04-07 13:41:34 [multiproc_executor.py:772]     self.model_runner = GPUModelRunnerV1(self.vllm_config, self.device)
ERROR 04-07 13:41:34 [multiproc_executor.py:772]                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/worker/gpu_model_runner.py", line 647, in __init__
ERROR 04-07 13:41:34 [multiproc_executor.py:772]     MultiModalBudget(self.vllm_config, self.mm_registry)
ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/worker/utils.py", line 45, in __init__
ERROR 04-07 13:41:34 [multiproc_executor.py:772]     max_tokens_by_modality = mm_registry.get_max_tokens_per_item_by_modality(
ERROR 04-07 13:41:34 [multiproc_executor.py:772]                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "/opt/conda/lib/python3.12/site-packages/vllm/multimodal/registry.py", line 177, in get_max_tokens_per_item_by_modality
ERROR 04-07 13:41:34 [multiproc_executor.py:772]     max_tokens_per_item = processor.info.get_mm_max_tokens_per_item(
ERROR 04-07 13:41:34 [multiproc_executor.py:772]                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "/opt/conda/lib/python3.12/site-packages/vllm/model_executor/models/qwen2_vl.py", line 817, in get_mm_max_tokens_per_item
ERROR 04-07 13:41:34 [multiproc_executor.py:772]     max_image_tokens = self.get_max_image_tokens()
ERROR 04-07 13:41:34 [multiproc_executor.py:772]                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "/opt/conda/lib/python3.12/site-packages/vllm/model_executor/models/qwen2_vl.py", line 939, in get_max_image_tokens
ERROR 04-07 13:41:34 [multiproc_executor.py:772]     target_width, target_height = self.get_image_size_with_most_features()
ERROR 04-07 13:41:34 [multiproc_executor.py:772]                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "/opt/conda/lib/python3.12/site-packages/vllm/model_executor/models/qwen2_vl.py", line 916, in get_image_size_with_most_features
ERROR 04-07 13:41:34 [multiproc_executor.py:772]     image_processor = self.get_image_processor()
ERROR 04-07 13:41:34 [multiproc_executor.py:772]                       ^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "/opt/conda/lib/python3.12/site-packages/vllm/model_executor/models/qwen3_vl.py", line 622, in get_image_processor
ERROR 04-07 13:41:34 [multiproc_executor.py:772]     return self.get_hf_processor(**kwargs).image_processor
ERROR 04-07 13:41:34 [multiproc_executor.py:772]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "/opt/conda/lib/python3.12/site-packages/vllm/model_executor/models/qwen3_vl.py", line 615, in get_hf_processor
ERROR 04-07 13:41:34 [multiproc_executor.py:772]     return self.ctx.get_hf_processor(
ERROR 04-07 13:41:34 [multiproc_executor.py:772]            ^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "/opt/conda/lib/python3.12/site-packages/vllm/multimodal/processing/context.py", line 368, in get_hf_processor
ERROR 04-07 13:41:34 [multiproc_executor.py:772]     return cached_processor_from_config(
ERROR 04-07 13:41:34 [multiproc_executor.py:772]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "/opt/conda/lib/python3.12/site-packages/vllm/transformers_utils/processor.py", line 251, in cached_processor_from_config
ERROR 04-07 13:41:34 [multiproc_executor.py:772]     return cached_get_processor_without_dynamic_kwargs(
ERROR 04-07 13:41:34 [multiproc_executor.py:772]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "/opt/conda/lib/python3.12/site-packages/vllm/transformers_utils/processor.py", line 210, in cached_get_processor_without_dynamic_kwargs
ERROR 04-07 13:41:34 [multiproc_executor.py:772]     processor = cached_get_processor(
ERROR 04-07 13:41:34 [multiproc_executor.py:772]                 ^^^^^^^^^^^^^^^^^^^^^
ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "/opt/conda/lib/python3.12/site-packages/vllm/transformers_utils/processor.py", line 128, in get_processor
ERROR 04-07 13:41:34 [multiproc_executor.py:772]     processor = processor_cls.from_pretrained(
ERROR 04-07 13:41:34 [multiproc_executor.py:772]                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "/opt/conda/lib/python3.12/site-packages/transformers/processing_utils.py", line 1422, in from_pretrained
ERROR 04-07 13:41:34 [multiproc_executor.py:772]     return cls.from_args_and_dict(args, processor_dict, **instantiation_kwargs)
ERROR 04-07 13:41:34 [multiproc_executor.py:772]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "/opt/conda/lib/python3.12/site-packages/transformers/processing_utils.py", line 1189, in from_args_and_dict
ERROR 04-07 13:41:34 [multiproc_executor.py:772]     processor = cls(*args, **valid_kwargs)
ERROR 04-07 13:41:34 [multiproc_executor.py:772]                 ^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "/opt/conda/lib/python3.12/site-packages/transformers/models/qwen3_vl/processing_qwen3_vl.py", line 60, in __init__
ERROR 04-07 13:41:34 [multiproc_executor.py:772]     super().__init__(image_processor, tokenizer, video_processor, chat_template=chat_template)
ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "/opt/conda/lib/python3.12/site-packages/transformers/processing_utils.py", line 620, in __init__
ERROR 04-07 13:41:34 [multiproc_executor.py:772]     self.check_argument_for_proper_class(attribute_name, arg)
ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "/opt/conda/lib/python3.12/site-packages/transformers/processing_utils.py", line 698, in check_argument_for_proper_class
ERROR 04-07 13:41:34 [multiproc_executor.py:772]     proper_class = tuple(self.get_possibly_dynamic_module(n) for n in class_name if n is not None)
ERROR 04-07 13:41:34 [multiproc_executor.py:772]                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "/opt/conda/lib/python3.12/site-packages/transformers/processing_utils.py", line 698, in <genexpr>
ERROR 04-07 13:41:34 [multiproc_executor.py:772]     proper_class = tuple(self.get_possibly_dynamic_module(n) for n in class_name if n is not None)
ERROR 04-07 13:41:34 [multiproc_executor.py:772]                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "/opt/conda/lib/python3.12/site-packages/transformers/processing_utils.py", line 1586, in get_possibly_dynamic_module
ERROR 04-07 13:41:34 [multiproc_executor.py:772]     if hasattr(transformers_module, module_name):
ERROR 04-07 13:41:34 [multiproc_executor.py:772]        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "/opt/conda/lib/python3.12/site-packages/transformers/utils/import_utils.py", line 2207, in __getattr__
ERROR 04-07 13:41:34 [multiproc_executor.py:772]     module = self._get_module(self._class_to_module[name])
ERROR 04-07 13:41:34 [multiproc_executor.py:772]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "/opt/conda/lib/python3.12/site-packages/transformers/utils/import_utils.py", line 2441, in _get_module
ERROR 04-07 13:41:34 [multiproc_executor.py:772]     raise e
ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "/opt/conda/lib/python3.12/site-packages/transformers/utils/import_utils.py", line 2439, in _get_module
ERROR 04-07 13:41:34 [multiproc_executor.py:772]     return importlib.import_module("." + module_name, self.__name__)
ERROR 04-07 13:41:34 [multiproc_executor.py:772]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "/opt/conda/lib/python3.12/importlib/__init__.py", line 90, in import_module
ERROR 04-07 13:41:34 [multiproc_executor.py:772]     return _bootstrap._gcd_import(name[level:], package, level)
ERROR 04-07 13:41:34 [multiproc_executor.py:772]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "<frozen importlib._bootstrap>", line 1387, in _gcd_import
ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "<frozen importlib._bootstrap>", line 1360, in _find_and_load
ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "<frozen importlib._bootstrap>", line 1331, in _find_and_load_unlocked
ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "<frozen importlib._bootstrap>", line 935, in _load_unlocked
ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "<frozen importlib._bootstrap_external>", line 999, in exec_module
ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "<frozen importlib._bootstrap>", line 488, in _call_with_frames_removed
ERROR 04-07 13:41:34 [multiproc_executor.py:772]   File "/opt/conda/lib/python3.12/site-packages/transformers/tokenization_mistral_common.py", line 42, in <module>
ERROR 04-07 13:41:34 [multiproc_executor.py:772]     from mistral_common.protocol.instruct.request import ChatCompletionRequest, ReasoningEffort
ERROR 04-07 13:41:34 [multiproc_executor.py:772] ImportError: cannot import name 'ReasoningEffort' from 'mistral_common.protocol.instruct.request' (/opt/conda/lib/python3.12/site-packages/mistral_common/protocol/instruct/request.py)
"/opt/conda/lib/python3.12/site-packages/transformers/tokenization_mistral_common.py", line 42, in ERROR 04-07 13:41:34 [multiproc_executor.py:772] from mistral_common.protocol.instruct.request import ChatCompletionRequest, ReasoningEffort ERROR 04-07 13:41:34 [multiproc_executor.py:772] ImportError: cannot import name 'ReasoningEffort' from 'mistral_common.protocol.instruct.request' (/opt/conda/lib/python3.12/site-packages/mistral_common/protocol/instruct/request.py) ERROR 04-07 13:41:34 [multiproc_executor.py:772] WorkerProc failed to start. ERROR 04-07 13:41:34 [multiproc_executor.py:772] Traceback (most recent call last): ERROR 04-07 13:41:34 [multiproc_executor.py:772] File "/opt/conda/lib/python3.12/site-packages/vllm/v1/executor/multiproc_executor.py", line 743, in worker_main ERROR 04-07 13:41:34 [multiproc_executor.py:772] worker = WorkerProc(*args, **kwargs) ERROR 04-07 13:41:34 [multiproc_executor.py:772] ^^^^^^^^^^^^^^^^^^^^^^^^^^^ ERROR 04-07 13:41:34 [multiproc_executor.py:772] File "/opt/conda/lib/python3.12/site-packages/vllm/v1/executor/multiproc_executor.py", line 569, in __init__ ERROR 04-07 13:41:34 [multiproc_executor.py:772] self.worker.init_device() ERROR 04-07 13:41:34 [multiproc_executor.py:772] File "/opt/conda/lib/python3.12/site-packages/vllm/v1/worker/worker_base.py", line 326, in init_device ERROR 04-07 13:41:34 [multiproc_executor.py:772] self.worker.init_device() # type: ignore ERROR 04-07 13:41:34 [multiproc_executor.py:772] ^^^^^^^^^^^^^^^^^^^^^^^^^ ERROR 04-07 13:41:34 [multiproc_executor.py:772] File "/opt/conda/lib/python3.12/site-packages/vllm/v1/worker/gpu_worker.py", line 262, in init_device ERROR 04-07 13:41:34 [multiproc_executor.py:772] self.model_runner = GPUModelRunnerV1(self.vllm_config, self.device) ERROR 04-07 13:41:34 [multiproc_executor.py:772] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ERROR 04-07 13:41:34 [multiproc_executor.py:772] File 
"/opt/conda/lib/python3.12/site-packages/vllm/v1/worker/gpu_model_runner.py", line 647, in __init__ ERROR 04-07 13:41:34 [multiproc_executor.py:772] MultiModalBudget(self.vllm_config, self.mm_registry) ERROR 04-07 13:41:34 [multiproc_executor.py:772] File "/opt/conda/lib/python3.12/site-packages/vllm/v1/worker/utils.py", line 45, in __init__ ERROR 04-07 13:41:34 [multiproc_executor.py:772] max_tokens_by_modality = mm_registry.get_max_tokens_per_item_by_modality( ERROR 04-07 13:41:34 [multiproc_executor.py:772] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ERROR 04-07 13:41:34 [multiproc_executor.py:772] File "/opt/conda/lib/python3.12/site-packages/vllm/multimodal/registry.py", line 177, in get_max_tokens_per_item_by_modality ERROR 04-07 13:41:34 [multiproc_executor.py:772] max_tokens_per_item = processor.info.get_mm_max_tokens_per_item( ERROR 04-07 13:41:34 [multiproc_executor.py:772] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ERROR 04-07 13:41:34 [multiproc_executor.py:772] File "/opt/conda/lib/python3.12/site-packages/vllm/model_executor/models/qwen2_vl.py", line 817, in get_mm_max_tokens_per_item ERROR 04-07 13:41:34 [multiproc_executor.py:772] max_image_tokens = self.get_max_image_tokens() ERROR 04-07 13:41:34 [multiproc_executor.py:772] ^^^^^^^^^^^^^^^^^^^^^^^^^^^ ERROR 04-07 13:41:34 [multiproc_executor.py:772] File "/opt/conda/lib/python3.12/site-packages/vllm/model_executor/models/qwen2_vl.py", line 939, in get_max_image_tokens ERROR 04-07 13:41:34 [multiproc_executor.py:772] target_width, target_height = self.get_image_size_with_most_features() ERROR 04-07 13:41:34 [multiproc_executor.py:772] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ERROR 04-07 13:41:34 [multiproc_executor.py:772] File "/opt/conda/lib/python3.12/site-packages/vllm/model_executor/models/qwen2_vl.py", line 916, in get_image_size_with_most_features ERROR 04-07 13:41:34 [multiproc_executor.py:772] image_processor = self.get_image_processor() ERROR 04-07 13:41:34 [multiproc_executor.py:772] 
^^^^^^^^^^^^^^^^^^^^^^^^^^ ERROR 04-07 13:41:34 [multiproc_executor.py:772] File "/opt/conda/lib/python3.12/site-packages/vllm/model_executor/models/qwen3_vl.py", line 622, in get_image_processor ERROR 04-07 13:41:34 [multiproc_executor.py:772] return self.get_hf_processor(**kwargs).image_processor ERROR 04-07 13:41:34 [multiproc_executor.py:772] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ERROR 04-07 13:41:34 [multiproc_executor.py:772] File "/opt/conda/lib/python3.12/site-packages/vllm/model_executor/models/qwen3_vl.py", line 615, in get_hf_processor ERROR 04-07 13:41:34 [multiproc_executor.py:772] return self.ctx.get_hf_processor( ERROR 04-07 13:41:34 [multiproc_executor.py:772] ^^^^^^^^^^^^^^^^^^^^^^^^^^ ERROR 04-07 13:41:34 [multiproc_executor.py:772] File "/opt/conda/lib/python3.12/site-packages/vllm/multimodal/processing/context.py", line 368, in get_hf_processor ERROR 04-07 13:41:34 [multiproc_executor.py:772] return cached_processor_from_config( ERROR 04-07 13:41:34 [multiproc_executor.py:772] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ERROR 04-07 13:41:34 [multiproc_executor.py:772] File "/opt/conda/lib/python3.12/site-packages/vllm/transformers_utils/processor.py", line 251, in cached_processor_from_config ERROR 04-07 13:41:34 [multiproc_executor.py:772] return cached_get_processor_without_dynamic_kwargs( ERROR 04-07 13:41:34 [multiproc_executor.py:772] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ERROR 04-07 13:41:34 [multiproc_executor.py:772] File "/opt/conda/lib/python3.12/site-packages/vllm/transformers_utils/processor.py", line 210, in cached_get_processor_without_dynamic_kwargs ERROR 04-07 13:41:34 [multiproc_executor.py:772] processor = cached_get_processor( ERROR 04-07 13:41:34 [multiproc_executor.py:772] ^^^^^^^^^^^^^^^^^^^^^ ERROR 04-07 13:41:34 [multiproc_executor.py:772] File "/opt/conda/lib/python3.12/site-packages/vllm/transformers_utils/processor.py", line 128, in get_processor ERROR 04-07 13:41:34 [multiproc_executor.py:772] processor = 
processor_cls.from_pretrained( ERROR 04-07 13:41:34 [multiproc_executor.py:772] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ERROR 04-07 13:41:34 [multiproc_executor.py:772] File "/opt/conda/lib/python3.12/site-packages/transformers/processing_utils.py", line 1422, in from_pretrained ERROR 04-07 13:41:34 [multiproc_executor.py:772] return cls.from_args_and_dict(args, processor_dict, **instantiation_kwargs) ERROR 04-07 13:41:34 [multiproc_executor.py:772] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ERROR 04-07 13:41:34 [multiproc_executor.py:772] File "/opt/conda/lib/python3.12/site-packages/transformers/processing_utils.py", line 1189, in from_args_and_dict ERROR 04-07 13:41:34 [multiproc_executor.py:772] processor = cls(*args, **valid_kwargs) ERROR 04-07 13:41:34 [multiproc_executor.py:772] ^^^^^^^^^^^^^^^^^^^^^^^^^^ ERROR 04-07 13:41:34 [multiproc_executor.py:772] File "/opt/conda/lib/python3.12/site-packages/transformers/models/qwen3_vl/processing_qwen3_vl.py", line 60, in __init__ ERROR 04-07 13:41:34 [multiproc_executor.py:772] super().__init__(image_processor, tokenizer, video_processor, chat_template=chat_template) ERROR 04-07 13:41:34 [multiproc_executor.py:772] File "/opt/conda/lib/python3.12/site-packages/transformers/processing_utils.py", line 620, in __init__ ERROR 04-07 13:41:34 [multiproc_executor.py:772] self.check_argument_for_proper_class(attribute_name, arg) ERROR 04-07 13:41:34 [multiproc_executor.py:772] File "/opt/conda/lib/python3.12/site-packages/transformers/processing_utils.py", line 698, in check_argument_for_proper_class ERROR 04-07 13:41:34 [multiproc_executor.py:772] proper_class = tuple(self.get_possibly_dynamic_module(n) for n in class_name if n is not None) ERROR 04-07 13:41:34 [multiproc_executor.py:772] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ERROR 04-07 13:41:34 [multiproc_executor.py:772] File "/opt/conda/lib/python3.12/site-packages/transformers/processing_utils.py", line 698, 
in ERROR 04-07 13:41:34 [multiproc_executor.py:772] proper_class = tuple(self.get_possibly_dynamic_module(n) for n in class_name if n is not None) ERROR 04-07 13:41:34 [multiproc_executor.py:772] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ERROR 04-07 13:41:34 [multiproc_executor.py:772] File "/opt/conda/lib/python3.12/site-packages/transformers/processing_utils.py", line 1586, in get_possibly_dynamic_module ERROR 04-07 13:41:34 [multiproc_executor.py:772] if hasattr(transformers_module, module_name): ERROR 04-07 13:41:34 [multiproc_executor.py:772] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ERROR 04-07 13:41:34 [multiproc_executor.py:772] File "/opt/conda/lib/python3.12/site-packages/transformers/utils/import_utils.py", line 2207, in __getattr__ ERROR 04-07 13:41:34 [multiproc_executor.py:772] module = self._get_module(self._class_to_module[name]) ERROR 04-07 13:41:34 [multiproc_executor.py:772] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ERROR 04-07 13:41:34 [multiproc_executor.py:772] File "/opt/conda/lib/python3.12/site-packages/transformers/utils/import_utils.py", line 2441, in _get_module ERROR 04-07 13:41:34 [multiproc_executor.py:772] raise e ERROR 04-07 13:41:34 [multiproc_executor.py:772] File "/opt/conda/lib/python3.12/site-packages/transformers/utils/import_utils.py", line 2439, in _get_module ERROR 04-07 13:41:34 [multiproc_executor.py:772] return importlib.import_module("." 
+ module_name, self.__name__) ERROR 04-07 13:41:34 [multiproc_executor.py:772] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ERROR 04-07 13:41:34 [multiproc_executor.py:772] File "/opt/conda/lib/python3.12/importlib/__init__.py", line 90, in import_module ERROR 04-07 13:41:34 [multiproc_executor.py:772] return _bootstrap._gcd_import(name[level:], package, level) ERROR 04-07 13:41:34 [multiproc_executor.py:772] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ERROR 04-07 13:41:34 [multiproc_executor.py:772] File "", line 1387, in _gcd_import ERROR 04-07 13:41:34 [multiproc_executor.py:772] File "", line 1360, in _find_and_load ERROR 04-07 13:41:34 [multiproc_executor.py:772] File "", line 1331, in _find_and_load_unlocked ERROR 04-07 13:41:34 [multiproc_executor.py:772] File "", line 935, in _load_unlocked ERROR 04-07 13:41:34 [multiproc_executor.py:772] File "", line 999, in exec_module ERROR 04-07 13:41:34 [multiproc_executor.py:772] File "", line 488, in _call_with_frames_removed ERROR 04-07 13:41:34 [multiproc_executor.py:772] File "/opt/conda/lib/python3.12/site-packages/transformers/tokenization_mistral_common.py", line 42, in ERROR 04-07 13:41:34 [multiproc_executor.py:772] from mistral_common.protocol.instruct.request import ChatCompletionRequest, ReasoningEffort ERROR 04-07 13:41:34 [multiproc_executor.py:772] ImportError: cannot import name 'ReasoningEffort' from 'mistral_common.protocol.instruct.request' (/opt/conda/lib/python3.12/site-packages/mistral_common/protocol/instruct/request.py) INFO 04-07 13:41:34 [multiproc_executor.py:730] Parent process exited, terminating worker INFO 04-07 13:41:34 [multiproc_executor.py:730] Parent process exited, terminating worker INFO 04-07 13:41:34 [multiproc_executor.py:730] Parent process exited, terminating worker INFO 04-07 13:41:34 [multiproc_executor.py:730] Parent process exited, terminating worker INFO 04-07 13:41:34 [multiproc_executor.py:730] Parent process exited, terminating worker INFO 04-07 
13:41:34 [multiproc_executor.py:730] Parent process exited, terminating worker INFO 04-07 13:41:34 [multiproc_executor.py:730] Parent process exited, terminating worker ERROR 04-07 13:41:34 [multiproc_executor.py:772] WorkerProc failed to start. ERROR 04-07 13:41:34 [multiproc_executor.py:772] Traceback (most recent call last): ERROR 04-07 13:41:34 [multiproc_executor.py:772] File "/opt/conda/lib/python3.12/site-packages/vllm/v1/executor/multiproc_executor.py", line 743, in worker_main ERROR 04-07 13:41:34 [multiproc_executor.py:772] worker = WorkerProc(*args, **kwargs) ERROR 04-07 13:41:34 [multiproc_executor.py:772] ^^^^^^^^^^^^^^^^^^^^^^^^^^^ ERROR 04-07 13:41:34 [multiproc_executor.py:772] File "/opt/conda/lib/python3.12/site-packages/vllm/v1/executor/multiproc_executor.py", line 569, in __init__ ERROR 04-07 13:41:34 [multiproc_executor.py:772] self.worker.init_device() ERROR 04-07 13:41:34 [multiproc_executor.py:772] File "/opt/conda/lib/python3.12/site-packages/vllm/v1/worker/worker_base.py", line 326, in init_device ERROR 04-07 13:41:34 [multiproc_executor.py:772] self.worker.init_device() # type: ignore ERROR 04-07 13:41:34 [multiproc_executor.py:772] ^^^^^^^^^^^^^^^^^^^^^^^^^ ERROR 04-07 13:41:34 [multiproc_executor.py:772] File "/opt/conda/lib/python3.12/site-packages/vllm/v1/worker/gpu_worker.py", line 262, in init_device ERROR 04-07 13:41:34 [multiproc_executor.py:772] self.model_runner = GPUModelRunnerV1(self.vllm_config, self.device) ERROR 04-07 13:41:34 [multiproc_executor.py:772] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ERROR 04-07 13:41:34 [multiproc_executor.py:772] File "/opt/conda/lib/python3.12/site-packages/vllm/v1/worker/gpu_model_runner.py", line 647, in __init__ ERROR 04-07 13:41:34 [multiproc_executor.py:772] MultiModalBudget(self.vllm_config, self.mm_registry) ERROR 04-07 13:41:34 [multiproc_executor.py:772] File "/opt/conda/lib/python3.12/site-packages/vllm/v1/worker/utils.py", line 45, in __init__ ERROR 04-07 13:41:34 
[multiproc_executor.py:772] max_tokens_by_modality = mm_registry.get_max_tokens_per_item_by_modality( ERROR 04-07 13:41:34 [multiproc_executor.py:772] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ERROR 04-07 13:41:34 [multiproc_executor.py:772] File "/opt/conda/lib/python3.12/site-packages/vllm/multimodal/registry.py", line 177, in get_max_tokens_per_item_by_modality ERROR 04-07 13:41:34 [multiproc_executor.py:772] max_tokens_per_item = processor.info.get_mm_max_tokens_per_item( ERROR 04-07 13:41:34 [multiproc_executor.py:772] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ERROR 04-07 13:41:34 [multiproc_executor.py:772] File "/opt/conda/lib/python3.12/site-packages/vllm/model_executor/models/qwen2_vl.py", line 817, in get_mm_max_tokens_per_item ERROR 04-07 13:41:34 [multiproc_executor.py:772] max_image_tokens = self.get_max_image_tokens() ERROR 04-07 13:41:34 [multiproc_executor.py:772] ^^^^^^^^^^^^^^^^^^^^^^^^^^^ ERROR 04-07 13:41:34 [multiproc_executor.py:772] File "/opt/conda/lib/python3.12/site-packages/vllm/model_executor/models/qwen2_vl.py", line 939, in get_max_image_tokens ERROR 04-07 13:41:34 [multiproc_executor.py:772] target_width, target_height = self.get_image_size_with_most_features() ERROR 04-07 13:41:34 [multiproc_executor.py:772] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ERROR 04-07 13:41:34 [multiproc_executor.py:772] File "/opt/conda/lib/python3.12/site-packages/vllm/model_executor/models/qwen2_vl.py", line 916, in get_image_size_with_most_features ERROR 04-07 13:41:34 [multiproc_executor.py:772] image_processor = self.get_image_processor() ERROR 04-07 13:41:34 [multiproc_executor.py:772] ^^^^^^^^^^^^^^^^^^^^^^^^^^ ERROR 04-07 13:41:34 [multiproc_executor.py:772] File "/opt/conda/lib/python3.12/site-packages/vllm/model_executor/models/qwen3_vl.py", line 622, in get_image_processor ERROR 04-07 13:41:34 [multiproc_executor.py:772] return self.get_hf_processor(**kwargs).image_processor ERROR 04-07 13:41:34 [multiproc_executor.py:772] 
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ERROR 04-07 13:41:34 [multiproc_executor.py:772] File "/opt/conda/lib/python3.12/site-packages/vllm/model_executor/models/qwen3_vl.py", line 615, in get_hf_processor ERROR 04-07 13:41:34 [multiproc_executor.py:772] return self.ctx.get_hf_processor( ERROR 04-07 13:41:34 [multiproc_executor.py:772] ^^^^^^^^^^^^^^^^^^^^^^^^^^ ERROR 04-07 13:41:34 [multiproc_executor.py:772] File "/opt/conda/lib/python3.12/site-packages/vllm/multimodal/processing/context.py", line 368, in get_hf_processor ERROR 04-07 13:41:34 [multiproc_executor.py:772] return cached_processor_from_config( ERROR 04-07 13:41:34 [multiproc_executor.py:772] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ERROR 04-07 13:41:34 [multiproc_executor.py:772] File "/opt/conda/lib/python3.12/site-packages/vllm/transformers_utils/processor.py", line 251, in cached_processor_from_config ERROR 04-07 13:41:34 [multiproc_executor.py:772] return cached_get_processor_without_dynamic_kwargs( ERROR 04-07 13:41:34 [multiproc_executor.py:772] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ERROR 04-07 13:41:34 [multiproc_executor.py:772] File "/opt/conda/lib/python3.12/site-packages/vllm/transformers_utils/processor.py", line 210, in cached_get_processor_without_dynamic_kwargs ERROR 04-07 13:41:34 [multiproc_executor.py:772] processor = cached_get_processor( ERROR 04-07 13:41:34 [multiproc_executor.py:772] ^^^^^^^^^^^^^^^^^^^^^ ERROR 04-07 13:41:34 [multiproc_executor.py:772] File "/opt/conda/lib/python3.12/site-packages/vllm/transformers_utils/processor.py", line 128, in get_processor ERROR 04-07 13:41:34 [multiproc_executor.py:772] processor = processor_cls.from_pretrained( ERROR 04-07 13:41:34 [multiproc_executor.py:772] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ERROR 04-07 13:41:34 [multiproc_executor.py:772] File "/opt/conda/lib/python3.12/site-packages/transformers/processing_utils.py", line 1422, in from_pretrained ERROR 04-07 13:41:34 [multiproc_executor.py:772] return cls.from_args_and_dict(args, processor_dict, 
**instantiation_kwargs) ERROR 04-07 13:41:34 [multiproc_executor.py:772] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ERROR 04-07 13:41:34 [multiproc_executor.py:772] File "/opt/conda/lib/python3.12/site-packages/transformers/processing_utils.py", line 1189, in from_args_and_dict ERROR 04-07 13:41:34 [multiproc_executor.py:772] processor = cls(*args, **valid_kwargs) ERROR 04-07 13:41:34 [multiproc_executor.py:772] ^^^^^^^^^^^^^^^^^^^^^^^^^^ ERROR 04-07 13:41:34 [multiproc_executor.py:772] File "/opt/conda/lib/python3.12/site-packages/transformers/models/qwen3_vl/processing_qwen3_vl.py", line 60, in __init__ ERROR 04-07 13:41:34 [multiproc_executor.py:772] super().__init__(image_processor, tokenizer, video_processor, chat_template=chat_template) ERROR 04-07 13:41:34 [multiproc_executor.py:772] File "/opt/conda/lib/python3.12/site-packages/transformers/processing_utils.py", line 620, in __init__ ERROR 04-07 13:41:34 [multiproc_executor.py:772] self.check_argument_for_proper_class(attribute_name, arg) ERROR 04-07 13:41:34 [multiproc_executor.py:772] File "/opt/conda/lib/python3.12/site-packages/transformers/processing_utils.py", line 698, in check_argument_for_proper_class ERROR 04-07 13:41:34 [multiproc_executor.py:772] proper_class = tuple(self.get_possibly_dynamic_module(n) for n in class_name if n is not None) ERROR 04-07 13:41:34 [multiproc_executor.py:772] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ERROR 04-07 13:41:34 [multiproc_executor.py:772] File "/opt/conda/lib/python3.12/site-packages/transformers/processing_utils.py", line 698, in ERROR 04-07 13:41:34 [multiproc_executor.py:772] proper_class = tuple(self.get_possibly_dynamic_module(n) for n in class_name if n is not None) ERROR 04-07 13:41:34 [multiproc_executor.py:772] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ERROR 04-07 13:41:34 [multiproc_executor.py:772] File "/opt/conda/lib/python3.12/site-packages/transformers/processing_utils.py", line 1586, 
in get_possibly_dynamic_module ERROR 04-07 13:41:34 [multiproc_executor.py:772] if hasattr(transformers_module, module_name): ERROR 04-07 13:41:34 [multiproc_executor.py:772] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ERROR 04-07 13:41:34 [multiproc_executor.py:772] File "/opt/conda/lib/python3.12/site-packages/transformers/utils/import_utils.py", line 2207, in __getattr__ ERROR 04-07 13:41:34 [multiproc_executor.py:772] module = self._get_module(self._class_to_module[name]) ERROR 04-07 13:41:34 [multiproc_executor.py:772] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ERROR 04-07 13:41:34 [multiproc_executor.py:772] File "/opt/conda/lib/python3.12/site-packages/transformers/utils/import_utils.py", line 2441, in _get_module ERROR 04-07 13:41:34 [multiproc_executor.py:772] raise e ERROR 04-07 13:41:34 [multiproc_executor.py:772] File "/opt/conda/lib/python3.12/site-packages/transformers/utils/import_utils.py", line 2439, in _get_module ERROR 04-07 13:41:34 [multiproc_executor.py:772] return importlib.import_module("." 
+ module_name, self.__name__) ERROR 04-07 13:41:34 [multiproc_executor.py:772] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ERROR 04-07 13:41:34 [multiproc_executor.py:772] File "/opt/conda/lib/python3.12/importlib/__init__.py", line 90, in import_module ERROR 04-07 13:41:34 [multiproc_executor.py:772] return _bootstrap._gcd_import(name[level:], package, level) ERROR 04-07 13:41:34 [multiproc_executor.py:772] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ERROR 04-07 13:41:34 [multiproc_executor.py:772] File "", line 1387, in _gcd_import ERROR 04-07 13:41:34 [multiproc_executor.py:772] File "", line 1360, in _find_and_load ERROR 04-07 13:41:34 [multiproc_executor.py:772] File "", line 1331, in _find_and_load_unlocked ERROR 04-07 13:41:34 [multiproc_executor.py:772] File "", line 935, in _load_unlocked ERROR 04-07 13:41:34 [multiproc_executor.py:772] File "", line 999, in exec_module ERROR 04-07 13:41:34 [multiproc_executor.py:772] File "", line 488, in _call_with_frames_removed ERROR 04-07 13:41:34 [multiproc_executor.py:772] File "/opt/conda/lib/python3.12/site-packages/transformers/tokenization_mistral_common.py", line 42, in ERROR 04-07 13:41:34 [multiproc_executor.py:772] from mistral_common.protocol.instruct.request import ChatCompletionRequest, ReasoningEffort ERROR 04-07 13:41:34 [multiproc_executor.py:772] ImportError: cannot import name 'ReasoningEffort' from 'mistral_common.protocol.instruct.request' (/opt/conda/lib/python3.12/site-packages/mistral_common/protocol/instruct/request.py) INFO 04-07 13:41:34 [multiproc_executor.py:730] Parent process exited, terminating worker ERROR 04-07 13:41:34 [multiproc_executor.py:772] WorkerProc failed to start. 
[rank0]:[W407 13:41:35.499843956 ProcessGroupNCCL.cpp:1544] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator())
(EngineCore_DP0 pid=762) ERROR 04-07 13:41:39 [core.py:946] EngineCore failed to start.
(EngineCore_DP0 pid=762) ERROR 04-07 13:41:39 [core.py:946] Traceback (most recent call last):
(EngineCore_DP0 pid=762) ERROR 04-07 13:41:39 [core.py:946]   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/engine/core.py", line 937, in run_engine_core
(EngineCore_DP0 pid=762) ERROR 04-07 13:41:39 [core.py:946]     engine_core = EngineCoreProc(*args, engine_index=dp_rank, **kwargs)
(EngineCore_DP0 pid=762) ERROR 04-07 13:41:39 [core.py:946]     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=762) ERROR 04-07 13:41:39 [core.py:946]   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/engine/core.py", line 691, in __init__
(EngineCore_DP0 pid=762) ERROR 04-07 13:41:39 [core.py:946]     super().__init__(
(EngineCore_DP0 pid=762) ERROR 04-07 13:41:39 [core.py:946]   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/engine/core.py", line 105, in __init__
(EngineCore_DP0 pid=762) ERROR 04-07 13:41:39 [core.py:946]     self.model_executor = executor_class(vllm_config)
(EngineCore_DP0 pid=762) ERROR 04-07 13:41:39 [core.py:946]     ^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=762) ERROR 04-07 13:41:39 [core.py:946]   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/executor/multiproc_executor.py", line 97, in __init__
(EngineCore_DP0 pid=762) ERROR 04-07 13:41:39 [core.py:946]     super().__init__(vllm_config)
(EngineCore_DP0 pid=762) ERROR 04-07 13:41:39 [core.py:946]   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/executor/abstract.py", line 101, in __init__
(EngineCore_DP0 pid=762) ERROR 04-07 13:41:39 [core.py:946]     self._init_executor()
(EngineCore_DP0 pid=762) ERROR 04-07 13:41:39 [core.py:946]   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/executor/multiproc_executor.py", line 165, in _init_executor
(EngineCore_DP0 pid=762) ERROR 04-07 13:41:39 [core.py:946]     self.workers = WorkerProc.wait_for_ready(unready_workers)
(EngineCore_DP0 pid=762) ERROR 04-07 13:41:39 [core.py:946]     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=762) ERROR 04-07 13:41:39 [core.py:946]   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/executor/multiproc_executor.py", line 678, in wait_for_ready
(EngineCore_DP0 pid=762) ERROR 04-07 13:41:39 [core.py:946]     raise e from None
(EngineCore_DP0 pid=762) ERROR 04-07 13:41:39 [core.py:946] Exception: WorkerProc initialization failed due to an exception in a background process. See stack trace for root cause.
(EngineCore_DP0 pid=762) Process EngineCore_DP0:
(EngineCore_DP0 pid=762) Traceback (most recent call last):
(EngineCore_DP0 pid=762)   File "/opt/conda/lib/python3.12/multiprocessing/process.py", line 314, in _bootstrap
(EngineCore_DP0 pid=762)     self.run()
(EngineCore_DP0 pid=762)   File "/opt/conda/lib/python3.12/multiprocessing/process.py", line 108, in run
(EngineCore_DP0 pid=762)     self._target(*self._args, **self._kwargs)
(EngineCore_DP0 pid=762)   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/engine/core.py", line 950, in run_engine_core
(EngineCore_DP0 pid=762)     raise e
(EngineCore_DP0 pid=762)   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/engine/core.py", line 937, in run_engine_core
(EngineCore_DP0 pid=762)     engine_core = EngineCoreProc(*args, engine_index=dp_rank, **kwargs)
(EngineCore_DP0 pid=762)     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=762)   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/engine/core.py", line 691, in __init__
(EngineCore_DP0 pid=762)     super().__init__(
(EngineCore_DP0 pid=762)   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/engine/core.py", line 105, in __init__
(EngineCore_DP0 pid=762)     self.model_executor = executor_class(vllm_config)
(EngineCore_DP0 pid=762)     ^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=762)   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/executor/multiproc_executor.py", line 97, in __init__
(EngineCore_DP0 pid=762)     super().__init__(vllm_config)
(EngineCore_DP0 pid=762)   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/executor/abstract.py", line 101, in __init__
(EngineCore_DP0 pid=762)     self._init_executor()
(EngineCore_DP0 pid=762)   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/executor/multiproc_executor.py", line 165, in _init_executor
(EngineCore_DP0 pid=762)     self.workers = WorkerProc.wait_for_ready(unready_workers)
(EngineCore_DP0 pid=762)     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=762)   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/executor/multiproc_executor.py", line 678, in wait_for_ready
(EngineCore_DP0 pid=762)     raise e from None
(EngineCore_DP0 pid=762) Exception: WorkerProc initialization failed due to an exception in a background process. See stack trace for root cause.
(APIServer pid=352) Traceback (most recent call last):
(APIServer pid=352)   File "/opt/conda/bin/vllm", line 8, in <module>
(APIServer pid=352)     sys.exit(main())
(APIServer pid=352)     ^^^^^^
(APIServer pid=352)   File "/opt/conda/lib/python3.12/site-packages/vllm/entrypoints/cli/main.py", line 73, in main
(APIServer pid=352)     args.dispatch_function(args)
(APIServer pid=352)   File "/opt/conda/lib/python3.12/site-packages/vllm/entrypoints/cli/serve.py", line 111, in cmd
(APIServer pid=352)     uvloop.run(run_server(args))
(APIServer pid=352)   File "/opt/conda/lib/python3.12/site-packages/uvloop/__init__.py", line 96, in run
(APIServer pid=352)     return __asyncio.run(
(APIServer pid=352)     ^^^^^^^^^^^^^^
(APIServer pid=352)   File "/opt/conda/lib/python3.12/asyncio/runners.py", line 195, in run
(APIServer pid=352)     return runner.run(main)
(APIServer pid=352)     ^^^^^^^^^^^^^^^^
(APIServer pid=352)   File "/opt/conda/lib/python3.12/asyncio/runners.py", line 118, in run
(APIServer pid=352)     return self._loop.run_until_complete(task)
(APIServer pid=352)     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=352)   File "uvloop/loop.pyx", line 1518, in uvloop.loop.Loop.run_until_complete
(APIServer pid=352)   File "/opt/conda/lib/python3.12/site-packages/uvloop/__init__.py", line 48, in wrapper
(APIServer pid=352)     return await main
(APIServer pid=352)     ^^^^^^^^^^
(APIServer pid=352)   File "/opt/conda/lib/python3.12/site-packages/vllm/entrypoints/openai/api_server.py", line 919, in run_server
(APIServer pid=352)     await run_server_worker(listen_address, sock, args, **uvicorn_kwargs)
(APIServer pid=352)   File "/opt/conda/lib/python3.12/site-packages/vllm/entrypoints/openai/api_server.py", line 938, in run_server_worker
(APIServer pid=352)     async with build_async_engine_client(
(APIServer pid=352)     ^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=352)   File "/opt/conda/lib/python3.12/contextlib.py", line 210, in __aenter__
(APIServer pid=352)     return await anext(self.gen)
(APIServer pid=352)     ^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=352)   File "/opt/conda/lib/python3.12/site-packages/vllm/entrypoints/openai/api_server.py", line 147, in build_async_engine_client
(APIServer pid=352)     async with build_async_engine_client_from_engine_args(
(APIServer pid=352)     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=352)   File "/opt/conda/lib/python3.12/contextlib.py", line 210, in __aenter__
(APIServer pid=352)     return await anext(self.gen)
(APIServer pid=352)     ^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=352)   File "/opt/conda/lib/python3.12/site-packages/vllm/entrypoints/openai/api_server.py", line 188, in build_async_engine_client_from_engine_args
(APIServer pid=352)     async_llm = AsyncLLM.from_vllm_config(
(APIServer pid=352)     ^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=352)   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/engine/async_llm.py", line 228, in from_vllm_config
(APIServer pid=352)     return cls(
(APIServer pid=352)     ^^^^
(APIServer pid=352)   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/engine/async_llm.py", line 155, in __init__
(APIServer pid=352)     self.engine_core = EngineCoreClient.make_async_mp_client(
(APIServer pid=352)     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=352)   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/engine/core_client.py", line 122, in make_async_mp_client
(APIServer pid=352)     return AsyncMPClient(*client_args)
(APIServer pid=352)     ^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=352)   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/engine/core_client.py", line 819, in __init__
(APIServer pid=352)     super().__init__(
(APIServer pid=352)   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/engine/core_client.py", line 479, in __init__
(APIServer pid=352)     with launch_core_engines(vllm_config, executor_class, log_stats) as (
(APIServer pid=352)     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=352)   File "/opt/conda/lib/python3.12/contextlib.py", line 144, in __exit__
(APIServer pid=352)     next(self.gen)
(APIServer pid=352)   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/engine/utils.py", line 933, in launch_core_engines
(APIServer pid=352)     wait_for_engine_startup(
(APIServer pid=352)   File "/opt/conda/lib/python3.12/site-packages/vllm/v1/engine/utils.py", line 992, in wait_for_engine_startup
(APIServer pid=352)     raise RuntimeError(
(APIServer pid=352) RuntimeError: Engine core initialization failed. See root cause above. Failed core proc(s): {}
/opt/conda/lib/python3.12/multiprocessing/resource_tracker.py:279: UserWarning: resource_tracker: There appear to be 1 leaked shared_memory objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
vllm 0.15.0 requires flashinfer-python==0.6.1, which is not installed.
vllm 0.15.0 requires opencv-python-headless>=4.13.0, but you have opencv-python-headless 4.11.0.86 which is incompatible.
vllm 0.15.0 requires torch==2.9.1, but you have torch 2.8.0+metax3.5.3.9 which is incompatible.
vllm 0.15.0 requires torchaudio==2.9.1, but you have torchaudio 2.4.1+metax3.5.3.9 which is incompatible.
vllm 0.15.0 requires torchvision==0.24.1, but you have torchvision 0.15.1+metax3.5.3.9 which is incompatible.
vllm 0.15.0 requires transformers<5,>=4.56.0, but you have transformers 5.5.0 which is incompatible.
vllm-metax 0.15.0+g24fb31.d20260310.maca3.5.3.20.torch2.8 requires transformers<5,>=4.56.0, but you have transformers 5.5.0 which is incompatible.
Successfully installed hf-xet-1.4.3 huggingface-hub-1.9.0 transformers-5.5.0
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager, possibly rendering your system unusable. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv. Use the --root-user-action option if you know what you are doing and want to suppress this warning.
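The failure chain above bottoms out in a version conflict: the pip run installed transformers 5.5.0, but vllm 0.15.0 and vllm-metax both declare `transformers<5,>=4.56.0`, and the transformers 5.x `tokenization_mistral_common` module additionally imports a `ReasoningEffort` symbol that the installed mistral_common does not export, which is exactly the `ImportError` that kills each worker. The declared bound can be sanity-checked with a minimal sketch (the helper name below is illustrative, not a vllm API; the bound is taken from the pip resolver output above):

```python
def satisfies_vllm_transformers_bound(version: str) -> bool:
    """Check a version string against the 'transformers<5,>=4.56.0' bound
    that vllm 0.15.0 declares, per the pip resolver output above.
    Only the numeric major.minor.patch components are compared."""
    parts = tuple(int(p) for p in version.split(".")[:3])
    return (4, 56, 0) <= parts < (5,)

# transformers 5.5.0 (the version pip just installed) violates the bound,
# which is why every WorkerProc crashes on import:
print(satisfies_vllm_transformers_bound("5.5.0"))   # False
print(satisfies_vllm_transformers_bound("4.57.1"))  # True
```

Given those constraints, one plausible remediation is reinstalling a 4.x transformers (e.g. `pip install "transformers<5,>=4.56.0"`), which should also sidestep the `ReasoningEffort` import since the 4.x line predates it; upgrading mistral_common instead may silence the ImportError but leaves the declared vllm bound violated.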