一、软硬件信息
1.服务器厂家:浪潮
2.沐曦GPU型号:METAX_C500_64G *4
3.操作系统内核版本:4.19.90-89.11.v2401.ky10.x86_64
4.是否开启CPU虚拟化:否
5.mx-smi回显:
mx-smi
mx-smi version: 2.2.9
=================== MetaX System Management Interface Log ===================
Timestamp : Thu Oct 30 10:02:21 2025
Attached GPUs : 4
+---------------------------------------------------------------------------------+
| MX-SMI 2.2.9 Kernel Mode Driver Version: 3.3.12 |
| MACA Version: unknown BIOS Version: 1.29.1.0 |
|------------------+-----------------+---------------------+----------------------|
| Board Name | GPU Persist-M | Bus-id | GPU-Util sGPU-M |
| Pwr:Usage/Cap | Temp Perf | Memory-Usage | GPU-State |
|==================+=================+=====================+======================|
| 0 MetaX C500 | 0 Off | 0000:43:00.0 | 0% Disabled |
| 54W / 350W | 31C P0 | 858/65536 MiB | Available |
+------------------+-----------------+---------------------+----------------------+
| 1 MetaX C500 | 1 Off | 0000:44:00.0 | 0% Disabled |
| 55W / 350W | 31C P0 | 858/65536 MiB | Available |
+------------------+-----------------+---------------------+----------------------+
| 2 MetaX C500 | 2 Off | 0000:45:00.0 | 0% Disabled |
| 60W / 350W | 33C P0 | 858/65536 MiB | Available |
+------------------+-----------------+---------------------+----------------------+
| 3 MetaX C500 | 3 Off | 0000:47:00.0 | 0% Disabled |
| 57W / 350W | 33C P0 | 858/65536 MiB | Available |
+------------------+-----------------+---------------------+----------------------+
+---------------------------------------------------------------------------------+
| Process: |
| GPU PID Process Name GPU Memory |
| Usage(MiB) |
|=================================================================================|
| no process found |
+---------------------------------------------------------------------------------+
End of Log
6.docker info回显:
docker info
Client:
Version: 28.3.3
Context: default
Debug Mode: false
Server:
Containers: 2
Running: 2
Paused: 0
Stopped: 0
Images: 6
Server Version: 28.3.3
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Using metacopy: false
Native Overlay Diff: true
userxattr: false
Logging Driver: json-file
Cgroup Driver: cgroupfs
Cgroup Version: 1
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local splunk syslog
CDI spec directories:
/etc/cdi
/var/run/cdi
Swarm: inactive
Runtimes: io.containerd.runc.v2 runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 05044ec0a9a75232cad458027ca83437aae3f4da
runc version: v1.2.6-0-ge89a299
init version: de40ad0
Security Options:
seccomp
Profile: builtin
Kernel Version: 4.19.90-89.11.v2401.ky10.x86_64
Operating System: Kylin Linux Advanced Server V10 (Halberd)
OSType: linux
Architecture: x86_64
CPUs: 128
Total Memory: 994.7GiB
Name: localhost.localdomain
ID: ca3e0563-e7fc-4f52-ad88-9655d1100756
Docker Root Dir: /data/docker
7.镜像版本:
cr.metax-tech.com/public-library/maca-pytorch:3.2.1.4-torch2.6-py310-ubuntu24.04-amd64
cr.metax-tech.com/public-ai-release/maca/vllm:maca.ai3.1.0.7-torch2.6-py310-ubuntu22.04-amd64
cr.metax-tech.com/public-ai-release/maca/modelzoo.llm.vllm:maca.ai2.33.1.12-torch2.6-py310-ubuntu22.04-amd64
cr.metax-tech.com/public-ai-release/maca/vllm:maca.ai2.33.1.12-torch2.6-py310-ubuntu22.04-amd64
8.启动容器命令:
docker run -it --device=/dev/dri --device=/dev/mxcd --group-add video --name images --device=/dev/mem --network=host --security-opt seccomp=unconfined --security-opt apparmor=unconfined --shm-size '100gb' --ulimit memlock=-1 -v /usr/local/:/usr/local/ -v /data/models/:/data/models/ ce3f69501a52 /bin/bash
9.容器内执行命令:
vllm serve /data/models/Qwen/Qwen3-VL-30B-A3B-Instruct --served-model-name Qwen3-VL-30B --tensor-parallel-size 4 --swap-space 16 --trust-remote-code --dtype bfloat16 --gpu-memory-utilization 0.9 --max-model-len 30720 --port 18091
二、问题现象
服务器是4张64G C500沐曦显卡,部署MiniCPM-V-4_5 、Qwen3-VL-30B-A3B-Instruct都失败了,vllm 0.10.0不支持这2个模型,请问沐曦官方的vllm0.11什么时候可以升级,Qwen3-Image也没有成功