How do I install official vLLM on the host after installing vllm-metax?

Members 14 posts

May 27, 2026, 09:43

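1. The host operating system is Kylin V10 SP3.
2. The GPU is a MetaX Xiyun C500.
3. vllm-metax 0.19.0 is already installed.
4. The official vLLM wheel is vllm-0.19.0-cp38-abi3-manylinux_2_31_x86_64.whl, where 2_31 means glibc >= 2.31. Kylin V10 SP3 ships glibc 2.28 by default, so the wheel cannot be installed directly. Forcibly upgrading glibc is risky, so I need to build and install official vLLM from source on the server.
5. How do I build and install official vLLM in this environment? I have read github.com/MetaX-MACA/vLLM-metax and https://github.com/vllm-project/vllm, but I cannot work out the exact build steps, because the official vLLM installation has no option for MetaX GPUs. This is urgent; thank you.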
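The reasoning in item 4 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names are hypothetical, and the check covers just the glibc part of the manylinux_X_Y tag, not the full wheel compatibility rules that pip applies.

```python
import platform
import re


def manylinux_glibc_requirement(wheel_name):
    """Return the (major, minor) glibc version encoded in a manylinux_X_Y tag, or None."""
    match = re.search(r"manylinux_(\d+)_(\d+)_", wheel_name)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def wheel_fits_glibc(wheel_name, host_glibc):
    """True when the host glibc is at least the version the wheel tag requires."""
    required = manylinux_glibc_requirement(wheel_name)
    return required is None or host_glibc >= required


def host_glibc_version():
    """Best-effort host glibc version via platform.libc_ver(); None if unavailable."""
    name, version = platform.libc_ver()
    match = re.match(r"(\d+)\.(\d+)", version or "")
    if name != "glibc" or match is None:
        return None
    return int(match.group(1)), int(match.group(2))


WHEEL = "vllm-0.19.0-cp38-abi3-manylinux_2_31_x86_64.whl"
print("wheel requires glibc:", manylinux_glibc_requirement(WHEEL))
print("fits glibc 2.28:", wheel_fits_glibc(WHEEL, (2, 28)))
print("host glibc:", host_glibc_version())
```

Running it on the host shows the detected glibc next to the requirement parsed from the wheel name, which makes the mismatch described above explicit.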

shuai_chen

Members 521 posts

May 27, 2026, 12:56

Hello, please upgrade the glibc version on the host; alternatively, we recommend deploying with a container image.

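LiuWei

May 27, 2026, 14:32

Hello,
1. This is a Xinchuang (domestic IT innovation) project. Docker is not on the approved Xinchuang software list, so it cannot be used; deployment must be directly on the host.
2. Kylin V10 SP3 ships glibc 2.28 by default. Upgrading to glibc 2.31 would have to be done by installing to a custom path, without overwriting or replacing the system glibc 2.28, and environment variables would need to be set afterwards. I cannot be sure how that would affect the operating system, so this approach is too risky.
3. I am currently building official vLLM 0.19.0 from source. According to material I found by querying an LLM, vLLM can be built from source on glibc 2.28. The server already has the MetaX GPU driver (metax-driver-3.5.3.11-rpm-x86_64.run), the SDK (maca-sdk-3.5.3.18-rpm-x86_64.tar.xz), cu-bridge (gitee.com/metax-maca/cu-bridge), and vllm-metax (maca-vllm-metax-0.19.0-py310-3.5.3.502-linux-x86_64.tar.xz) installed, and mx-smi reports normally. Building the official vLLM 0.19.0 source (github.com/vllm-project/vllm) now fails. The build command was: python setup.py install, and the error says /opt/maca/tools/cu-bridge/bin/gnu_wrapper is missing. The full error is in the screenshots below, which also show the contents of /opt/maca/tools/cu-bridge/bin/.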

[Attachments: four screenshots, each named image.png (PNG; 931.5 KB, 1.6 MB, 1.4 MB, 1.6 MB), uploaded by LiuWei on May 27, 2026.]

shuai_chen

May 27, 2026, 15:22

Hello, please share the environment variables you have set for cu-bridge.

LiuWei

May 28, 2026, 08:46

Sure, the cu-bridge environment variables are shown in the screenshot:

[Attachment: ccdf5cae490a408154b182299751abae.jpg (JPG, 286.7 KB), uploaded by LiuWei on May 28, 2026.]

shuai_chen

May 28, 2026, 10:07

Hello, please first follow the sample in the cu-bridge usage documentation and check whether it compiles, to confirm that cu-bridge is installed and working correctly.

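LiuWei

May 28, 2026, 11:14

OK.
1. The metax-maca/cu-bridge version I installed is 3.5.3.
2. I will now follow the sample in the cu-bridge usage documentation and see whether it compiles. Which version of github.com/NVIDIA/cuda-samples should I download? Material I found by querying an LLM says v11.5 or v11.6 can be downloaded and built. Is that accurate?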

shuai_chen

May 28, 2026, 11:15

Hello, that is correct.

LiuWei

May 28, 2026, 14:49

Hello, I built cuda-samples v11.5 on the server, but the build failed. Details below:

1. The cu-bridge environment variables are:
export MACA_CLANG_PATH=/opt/maca/mxgpu_llvm/bin
export CUCC_PATH=/opt/maca/tools/cu-bridge
export MACA_PATH=/opt/maca
export CUDA_PATH=/opt/maca/tools/cu-bridge
export LD_LIBRARY_PATH=/opt/maca/mxgpu_llvm/lib:/opt/maca/ompi/lib/:/opt/maca/lib:${LD_LIBRARY_PATH}
export PATH=/opt/maca/tools/cu-bridge/tools:/opt/maca/bin:/opt/maca/mxgpu_llvm/bin/:${CUDA_PATH}/bin:$PATH
export CUCC_CMAKE_ENTRY=2

2. Output of which mxcc:
(vllm_clone_0.19.0) [root@localhost workspace]# which mxcc
/opt/maca/mxgpu_llvm/bin/mxcc

3. Output of macainfo:
(vllm_clone_0.19.0) [root@localhost workspace]# macainfo
======================
MXC System Attributes
======================
Runtime Version: 1.0
System Timestamp Freq: 1000MHz
Signal Max Wait Time: 18446744073709551615(0xffffffffffffffff)
Machine Model: LARGE
System Endianess: LITTLE

Agent 1

Name: HYGON C86-4G (OPN:7491)
Uuid: 4350552d-5858-0000-0000-000000000000
Market Name: Hygon C86-4G (OPN:7491)
Vendor Name: CPU
Feature: Not Specified
Profile: FULL_PROFILE
Float Round Mode: NEAR
Max Queue Number: 0(0x0)
Queue Min Size: 0
Queue Max Size: 0
Queue Type: MULTI
Node: 0
Device Type: CPU
Cache Info:
Chip ID: 0(0x0)
Cacheline Size: 64(0x40)
Max Clock Freq(MHz): 0
BDFID: 0
Internal Node ID: 0
Accelerator Processors: 32
PEUs per AP: 0
Data Processor Clustes(DPCs): 0
DPC Arrays per DPC.: 0
Watch Pointers on Address Ranges: 1
Pool Info:
Pool 1
Segment: GLOBAL
Flags: FINE_GRAINED
Size: 2321540(0x236c84) KB
Allocatable: TRUE
Alloc Granule: 4 KB
Alloc Alignment: 4 KB
Accessible by all: TRUE
Pool 2
Segment: GLOBAL
Flags: FINE_GRAINED
Size: 2321540(0x236c84) KB
Allocatable: TRUE
Alloc Granule: 4 KB
Alloc Alignment: 4 KB
Accessible by all: TRUE
ISA Info:
N/A

4. Running make_maca in the cuda-samples-11.5 directory fails with:

(vllm_clone_0.19.0) [root@localhost cuda-samples-11.5]# make_maca
make maca verion:0220 ...
make[1]: 进入目录“/data/workspace/cuda-samples-11.5/Samples/BlackScholes”

GCC Version is greater or equal to 4.8.0 <<<
make[1]: 对“all”无需做任何事。
make[1]: 离开目录“/data/workspace/cuda-samples-11.5/Samples/BlackScholes”
make[1]: 进入目录“/data/workspace/cuda-samples-11.5/Samples/BlackScholes_nvrtc”
GCC Version is greater or equal to 4.8.0 <<<
make[1]: 对“all”无需做任何事。
make[1]: 离开目录“/data/workspace/cuda-samples-11.5/Samples/BlackScholes_nvrtc”
make[1]: 进入目录“/data/workspace/cuda-samples-11.5/Samples/EGLStream_CUDA_CrossGPU”
WARNING - libEGL.so not found, please install libEGL.so <<<
WARNING - egl.h not found, please install egl.h <<<
WARNING - eglext.h not found, please install eglext.h <<<
GCC Version is greater or equal to 4.8.0 <<<
[@] /root/cu-bridge/CUDA_DIR//bin/nvcc -ccbin g++ -I../../Common -m64 --threads 0 --std=c++11 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_86,code=compute_86 -o cuda_consumer.o -c cuda_consumer.cpp
[@] /root/cu-bridge/CUDA_DIR//bin/nvcc -ccbin g++ -I../../Common -m64 --threads 0 --std=c++11 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_86,code=compute_86 -o cuda_producer.o -c cuda_producer.cpp
[@] /root/cu-bridge/CUDA_DIR//bin/nvcc -ccbin g++ -I../../Common -m64 --threads 0 --std=c++11 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_86,code=compute_86 -o eglstrm_common.o -c eglstrm_common.cpp
[@] /root/cu-bridge/CUDA_DIR//bin/nvcc -ccbin g++ -I../../Common -m64 --threads 0 --std=c++11 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_86,code=compute_86 -o kernel.o -c kernel.cu
[@] /root/cu-bridge/CUDA_DIR//bin/nvcc -ccbin g++ -I../../Common -m64 --threads 0 --std=c++11 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_86,code=compute_86 -o main.o -c main.cpp
[@] /root/cu-bridge/CUDA_DIR//bin/nvcc -ccbin g++ -m64 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_86,code=compute_86 -o EGLStream_CUDA_CrossGPU cuda_consumer.o cuda_producer.o eglstrm_common.o kernel.o main.o -lEGL -L/root/cu-bridge/CUDA_DIR//lib64/stubs -lcuda
[@] mkdir -p ../../bin/x86_64/linux/release
[@] cp EGLStream_CUDA_CrossGPU ../../bin/x86_64/linux/release
make[1]: 离开目录“/data/workspace/cuda-samples-11.5/Samples/EGLStream_CUDA_CrossGPU”
make[1]: 进入目录“/data/workspace/cuda-samples-11.5/Samples/EGLStream_CUDA_Interop”
WARNING - libEGL.so not found, please install libEGL.so <<<
WARNING - egl.h not found, please install egl.h <<<
WARNING - eglext.h not found, please install eglext.h <<<
[@] /root/cu-bridge/CUDA_DIR//bin/nvcc -ccbin g++ -I../../Common -m64 --threads 0 -o cuda_consumer.o -c cuda_consumer.cpp
[@] /root/cu-bridge/CUDA_DIR//bin/nvcc -ccbin g++ -I../../Common -m64 --threads 0 -o cuda_producer.o -c cuda_producer.cpp
[@] /root/cu-bridge/CUDA_DIR//bin/nvcc -ccbin g++ -I../../Common -m64 --threads 0 -o eglstrm_common.o -c eglstrm_common.cpp
[@] /root/cu-bridge/CUDA_DIR//bin/nvcc -ccbin g++ -I../../Common -m64 --threads 0 -o main.o -c main.cpp
[@] /root/cu-bridge/CUDA_DIR//bin/nvcc -ccbin g++ -m64 -o EGLStream_CUDA_Interop cuda_consumer.o cuda_producer.o eglstrm_common.o main.o -lEGL -L/root/cu-bridge/CUDA_DIR//lib64/stubs -lcuda
[@] mkdir -p ../../bin/x86_64/linux/release
[@] cp EGLStream_CUDA_Interop ../../bin/x86_64/linux/release
make[1]: 离开目录“/data/workspace/cuda-samples-11.5/Samples/EGLStream_CUDA_Interop”
make[1]: 进入目录“/data/workspace/cuda-samples-11.5/Samples/EGLStreams_CUDA_Interop”
WARNING - libEGL.so not found, please install libEGL.so <<<
WARNING - egl.h not found, please install egl.h <<<
WARNING - eglext.h not found, please install eglext.h <<<
GCC Version is greater or equal to 4.8.0 <<<
[@] /root/cu-bridge/CUDA_DIR//bin/nvcc -ccbin g++ -I../../Common -m64 --threads 0 --std=c++11 -o cuda_consumer.o -c cuda_consumer.cpp
[@] /root/cu-bridge/CUDA_DIR//bin/nvcc -ccbin g++ -I../../Common -m64 --threads 0 --std=c++11 -o cuda_producer.o -c cuda_producer.cpp
[@] /root/cu-bridge/CUDA_DIR//bin/nvcc -ccbin g++ -I../../Common -m64 --threads 0 --std=c++11 -o eglstrm_common.o -c eglstrm_common.cpp
[@] /root/cu-bridge/CUDA_DIR//bin/nvcc -ccbin g++ -I../../Common -m64 --threads 0 --std=c++11 -o main.o -c main.cpp
[@] /root/cu-bridge/CUDA_DIR//bin/nvcc -ccbin g++ -m64 -o EGLStream_CUDA_Interop cuda_consumer.o cuda_producer.o eglstrm_common.o main.o -lEGL -L/root/cu-bridge/CUDA_DIR//lib64/stubs -lcuda
[@] mkdir -p ../../bin/x86_64/linux/release
[@] cp EGLStream_CUDA_Interop ../../bin/x86_64/linux/release
make[1]: 离开目录“/data/workspace/cuda-samples-11.5/Samples/EGLStreams_CUDA_Interop”
make[1]: 进入目录“/data/workspace/cuda-samples-11.5/Samples/EGLSync_CUDAEvent_Interop”
WARNING - EGLSync_CUDAEvent_Interop is not supported on Linux x86_64 - waiving sample <<<
WARNING - libEGL.so not found, please install libEGL.so <<<
WARNING - egl.h not found, please install egl.h <<<
WARNING - eglext.h not found, please install eglext.h <<<
GCC Version is greater or equal to 4.8.0 <<<
[@] /root/cu-bridge/CUDA_DIR//bin/nvcc -ccbin g++ -I../../Common -m64 --threads 0 --std=c++11 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_86,code=compute_86 -o EGLSync_CUDAEvent_Interop.o -c EGLSync_CUDAEvent_Interop.cu
[@] /root/cu-bridge/CUDA_DIR//bin/nvcc -ccbin g++ -m64 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_86,code=compute_86 -o EGLSync_CUDAEvent_Interop EGLSync_CUDAEvent_Interop.o -lEGL -L/root/cu-bridge/CUDA_DIR//lib64/stubs -lcuda -lX11 -lGLESv2
[@] mkdir -p ../../bin/x86_64/linux/release
[@] cp EGLSync_CUDAEvent_Interop ../../bin/x86_64/linux/release
make[1]: 离开目录“/data/workspace/cuda-samples-11.5/Samples/EGLSync_CUDAEvent_Interop”
make[1]: 进入目录“/data/workspace/cuda-samples-11.5/Samples/FDTD3d”
GCC Version is greater or equal to 4.8.0 <<<
make[1]: 对“all”无需做任何事。
make[1]: 离开目录“/data/workspace/cuda-samples-11.5/Samples/FDTD3d”
make[1]: 进入目录“/data/workspace/cuda-samples-11.5/Samples/FilterBorderControlNPP”
GCC Version is greater or equal to 4.8.0 <<<
test.c:1:10: fatal error: 'FreeImage.h' file not found
1 | #include "FreeImage.h"
| ^~~
1 error generated.
WARNING - FreeImage is not set up correctly. Please ensure FreeImage is set up correctly. <<<
[@] /root/cu-bridge/CUDA_DIR//bin/nvcc -ccbin g++ -I../../Common -I../../Common/UtilNPP -m64 --threads 0 --std=c++11 -gencode arch=compute_35,code=compute_35 -o FilterBorderControlNPP.o -c FilterBorderControlNPP.cpp
[@] /root/cu-bridge/CUDA_DIR//bin/nvcc -ccbin g++ -m64 -gencode arch=compute_35,code=compute_35 -o FilterBorderControlNPP FilterBorderControlNPP.o -lnppisu_static -lnppif_static -lnppitc_static -lnppidei_static -lnppial_static -lnppc_static -lculibos -lfreeimage
[@] mkdir -p ../../bin/x86_64/linux/release
[@] cp FilterBorderControlNPP ../../bin/x86_64/linux/release
make[1]: 离开目录“/data/workspace/cuda-samples-11.5/Samples/FilterBorderControlNPP”
make[1]: 进入目录“/data/workspace/cuda-samples-11.5/Samples/FunctionPointers”
WARNING - libGL.so not found, refer to CUDA Getting Started Guide for how to find and install them. <<<
WARNING - libGLU.so not found, refer to CUDA Getting Started Guide for how to find and install them. <<<
WARNING - gl.h not found, refer to CUDA Getting Started Guide for how to find and install them. <<<
WARNING - glu.h not found, refer to CUDA Getting Started Guide for how to find and install them. <<<
GCC Version is greater or equal to 4.8.0 <<<
[@] /root/cu-bridge/CUDA_DIR//bin/nvcc -ccbin g++ -I../../Common -m64 --threads 0 --std=c++11 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_86,code=compute_86 -o FunctionPointers.o -c FunctionPointers.cpp
[@] /root/cu-bridge/CUDA_DIR//bin/nvcc -ccbin g++ -I../../Common -m64 --threads 0 --std=c++11 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_86,code=compute_86 -o FunctionPointers_kernels.o -c FunctionPointers_kernels.cu
[@] /root/cu-bridge/CUDA_DIR//bin/nvcc -ccbin g++ -m64 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_86,code=compute_86 -o FunctionPointers FunctionPointers.o FunctionPointers_kernels.o -lGL -lGLU -lglut
[@] mkdir -p ../../bin/x86_64/linux/release
[@] cp FunctionPointers ../../bin/x86_64/linux/release
make[1]: 离开目录“/data/workspace/cuda-samples-11.5/Samples/FunctionPointers”
make[1]: 进入目录“/data/workspace/cuda-samples-11.5/Samples/HSOpticalFlow”
GCC Version is greater or equal to 4.8.0 <<<
make[1]: 对“all”无需做任何事。
make[1]: 离开目录“/data/workspace/cuda-samples-11.5/Samples/HSOpticalFlow”
make[1]: 进入目录“/data/workspace/cuda-samples-11.5/Samples/MC_EstimatePiInlineP”
GCC Version is greater or equal to 4.8.0 <<<
make[1]: 对“all”无需做任何事。
make[1]: 离开目录“/data/workspace/cuda-samples-11.5/Samples/MC_EstimatePiInlineP”
make[1]: 进入目录“/data/workspace/cuda-samples-11.5/Samples/MC_EstimatePiInlineQ”
GCC Version is greater or equal to 4.8.0 <<<
make[1]: 对“all”无需做任何事。
make[1]: 离开目录“/data/workspace/cuda-samples-11.5/Samples/MC_EstimatePiInlineQ”
make[1]: 进入目录“/data/workspace/cuda-samples-11.5/Samples/MC_EstimatePiP”
GCC Version is greater or equal to 4.8.0 <<<
make[1]: 对“all”无需做任何事。
make[1]: 离开目录“/data/workspace/cuda-samples-11.5/Samples/MC_EstimatePiP”
make[1]: 进入目录“/data/workspace/cuda-samples-11.5/Samples/MC_EstimatePiQ”
GCC Version is greater or equal to 4.8.0 <<<
make[1]: 对“all”无需做任何事。
make[1]: 离开目录“/data/workspace/cuda-samples-11.5/Samples/MC_EstimatePiQ”
make[1]: 进入目录“/data/workspace/cuda-samples-11.5/Samples/MC_SingleAsianOptionP”
GCC Version is greater or equal to 4.8.0 <<<
make[1]: 对“all”无需做任何事。
make[1]: 离开目录“/data/workspace/cuda-samples-11.5/Samples/MC_SingleAsianOptionP”
make[1]: 进入目录“/data/workspace/cuda-samples-11.5/Samples/Mandelbrot”
WARNING - libGL.so not found, refer to CUDA Getting Started Guide for how to find and install them. <<<
WARNING - libGLU.so not found, refer to CUDA Getting Started Guide for how to find and install them. <<<
WARNING - gl.h not found, refer to CUDA Getting Started Guide for how to find and install them. <<<
WARNING - glu.h not found, refer to CUDA Getting Started Guide for how to find and install them. <<<
GCC Version is greater or equal to 4.8.0 <<<
[@] /root/cu-bridge/CUDA_DIR//bin/nvcc -ccbin g++ -I../../Common -m64 --threads 0 --std=c++11 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_86,code=compute_86 -o Mandelbrot.o -c Mandelbrot.cpp
[@] /root/cu-bridge/CUDA_DIR//bin/nvcc -ccbin g++ -I../../Common -m64 --threads 0 --std=c++11 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_86,code=compute_86 -o Mandelbrot_cuda.o -c Mandelbrot_cuda.cu
[@] /root/cu-bridge/CUDA_DIR//bin/nvcc -ccbin g++ -I../../Common -m64 --threads 0 --std=c++11 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_86,code=compute_86 -o Mandelbrot_gold.o -c Mandelbrot_gold.cpp
[@] /root/cu-bridge/CUDA_DIR//bin/nvcc -ccbin g++ -m64 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_86,code=compute_86 -o Mandelbrot Mandelbrot.o Mandelbrot_cuda.o Mandelbrot_gold.o -lGL -lGLU -lglut
[@] mkdir -p ../../bin/x86_64/linux/release
[@] cp Mandelbrot ../../bin/x86_64/linux/release
make[1]: 离开目录“/data/workspace/cuda-samples-11.5/Samples/Mandelbrot”
make[1]: 进入目录“/data/workspace/cuda-samples-11.5/Samples/MersenneTwisterGP11213”
GCC Version is greater or equal to 4.8.0 <<<
/root/cu-bridge/CUDA_DIR//bin/nvcc -ccbin g++ -I../../Common -m64 --threads 0 --std=c++11 -gencode arch=compute_35,code=compute_35 -o MersenneTwister.o -c MersenneTwister.cpp
mxcc: warning: argument unused during compilation: '-Xdevice -D__CUDA_ARCH__=700' [-Wunused-command-line-argument]
mxcc: warning: argument unused during compilation: '--offload-arch=xcore1000' [-Wunused-command-line-argument]
In file included from MersenneTwister.cpp:43:
../../Common/helper_cuda.h:240:11: warning: enumeration value 'MCRAND_STATUS_NOT_IMPLEMENTED' not handled in switch [-Wswitch]
240 | switch (error) {
| ^~~
MersenneTwister.cpp:62:15: error: use of undeclared identifier 'findCudaDevice'
62 | int devID = findCudaDevice(argc, (const char **)argv);
| ^
MersenneTwister.cpp:83:3: error: use of undeclared identifier 'checkCudaErrors'
83 | checkCudaErrors(cudaStreamCreateWithFlags(&stream, cudaStreamNonBlocking));
| ^
MersenneTwister.cpp:86:3: error: use of undeclared identifier 'checkCudaErrors'
86 | checkCudaErrors(cudaMalloc((void **)&d_Rand, rand_n * sizeof(float)));
| ^
MersenneTwister.cpp:89:3: error: use of undeclared identifier 'checkCudaErrors'
89 | checkCudaErrors(curandCreateGenerator(&prngGPU, CURAND_RNG_PSEUDO_MTGP32));
| ^
MersenneTwister.cpp:90:3: error: use of undeclared identifier 'checkCudaErrors'
90 | checkCudaErrors(curandSetStream(prngGPU, stream));
| ^
MersenneTwister.cpp:91:3: error: use of undeclared identifier 'checkCudaErrors'
91 | checkCudaErrors(curandSetPseudoRandomGeneratorSeed(prngGPU, seed));
| ^
MersenneTwister.cpp:94:3: error: use of undeclared identifier 'checkCudaErrors'
94 | checkCudaErrors(
| ^
MersenneTwister.cpp:96:3: error: use of undeclared identifier 'checkCudaErrors'
96 | checkCudaErrors(curandSetPseudoRandomGeneratorSeed(prngCPU, seed));
| ^
MersenneTwister.cpp:101:3: error: use of undeclared identifier 'checkCudaErrors'
101 | checkCudaErrors(cudaMallocHost(&h_RandGPU, rand_n * sizeof(float)));
| ^
MersenneTwister.cpp:104:3: error: use of undeclared identifier 'checkCudaErrors'
104 | checkCudaErrors(curandGenerateUniform(prngGPU, (float *)d_Rand, rand_n));
| ^
MersenneTwister.cpp:107:3: error: use of undeclared identifier 'checkCudaErrors'
107 | checkCudaErrors(cudaMemcpyAsync(h_RandGPU, d_Rand, rand_n * sizeof(float),
| ^
MersenneTwister.cpp:113:3: error: use of undeclared identifier 'checkCudaErrors'
113 | checkCudaErrors(curandGenerateUniform(prngCPU, (float *)h_RandCPU, rand_n));
| ^
MersenneTwister.cpp:115:3: error: use of undeclared identifier 'checkCudaErrors'
115 | checkCudaErrors(cudaStreamSynchronize(stream));
| ^
MersenneTwister.cpp:130:5: error: use of undeclared identifier 'checkCudaErrors'
130 | checkCudaErrors(curandGenerateUniform(prngGPU, (float *)d_Rand, rand_n));
| ^
MersenneTwister.cpp:133:3: error: use of undeclared identifier 'checkCudaErrors'
133 | checkCudaErrors(cudaStreamSynchronize(stream));
| ^
MersenneTwister.cpp:145:3: error: use of undeclared identifier 'checkCudaErrors'
145 | checkCudaErrors(curandDestroyGenerator(prngGPU));
| ^
MersenneTwister.cpp:146:3: error: use of undeclared identifier 'checkCudaErrors'
146 | checkCudaErrors(curandDestroyGenerator(prngCPU));
| ^
MersenneTwister.cpp:147:3: error: use of undeclared identifier 'checkCudaErrors'
147 | checkCudaErrors(cudaStreamDestroy(stream));
| ^
MersenneTwister.cpp:148:3: error: use of undeclared identifier 'checkCudaErrors'
148 | checkCudaErrors(cudaFree(d_Rand));
| ^
fatal error: too many errors emitted, stopping now [-ferror-limit=]
1 warning and 20 errors generated.
make[1]: *** [Makefile:346：MersenneTwister.o] 错误 1
make[1]: 离开目录“/data/workspace/cuda-samples-11.5/Samples/MersenneTwisterGP11213”
make: *** [Makefile:45：Samples/MersenneTwisterGP11213/Makefile.ph_build] 错误 2
(vllm_clone_0.19.0) [root@localhost cuda-samples-11.5]#
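The compiler output above is long, but the error lines it shows name only two identifiers. A minimal sketch for tallying them from a saved log follows; the function name is hypothetical and the sample lines are copied from the output above. It only counts occurrences and makes no claim about the root cause.

```python
import re
from collections import Counter

# Matches clang-style diagnostics such as:
#   file.cpp:62:15: error: use of undeclared identifier 'findCudaDevice'
UNDECLARED = re.compile(r"error: use of undeclared identifier '([^']+)'")


def count_undeclared(log_text):
    """Count how often each undeclared identifier appears in the error lines."""
    return Counter(UNDECLARED.findall(log_text))


sample_log = """\
MersenneTwister.cpp:62:15: error: use of undeclared identifier 'findCudaDevice'
MersenneTwister.cpp:83:3: error: use of undeclared identifier 'checkCudaErrors'
MersenneTwister.cpp:86:3: error: use of undeclared identifier 'checkCudaErrors'
"""

for name, hits in count_undeclared(sample_log).most_common():
    print(f"{name}: {hits}")
```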

shuai_chen

May 28, 2026, 15:40

Hello, please describe your complete installation steps.

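LiuWei

May 28, 2026, 16:21

The server was previously running a Qwen3 model on an older vLLM version.
The upgrade went as follows:
1. Uninstalled the old driver and SDK and installed the new versions: metax-driver-3.5.3.11-rpm-x86_64.run and maca-sdk-3.5.3.18-rpm-x86_64.tar.xz.
2. On an internet-connected server, created a new conda virtual environment with python=3.10 and installed Transformers, maca-pytorch2.6-py310-3.5.3.9-x86_64.tar.xz, and maca-vllm-metax-0.19.0-py310-3.5.3.502-linux-x86_64.tar.xz. Packed the environment with conda pack, copied it to the intranet server, and unpacked it with conda-unpack.
3. Downloaded the cu-bridge 3.5.3 package from:
gitee.com/metax-maca/cu-bridge
The build went as follows: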
unzip 3.1.0.zip
mv cu-bridge-3.1.0 cu-bridge
sudo chmod 755 cu-bridge -Rf
cd cu-bridge
mkdir build && cd ./build
cmake -DCMAKE_INSTALL_PREFIX=/opt/maca/tools/cu-bridge ../
make && make install

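4. Downloaded cuda-samples version 11.5 from:
github.com/NVIDIA/cuda-samples

4.1 Checked that the MXMACA compiler is installed correctly:
which mxcc

4.2 Checked that the MXMACA runtime environment is working:
macainfo

4.3 Checked the cu-bridge environment variables.

The output of steps 4.1 to 4.3 is in the log in my previous reply.

5. After extracting cuda-samples-11.5:
cd cuda-samples-11.5
Ran the build:
make_maca

[Attachment: image.png (PNG, 148.4 KB), uploaded by LiuWei on May 28, 2026.]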

shuai_chen

May 28, 2026, 16:31

Hello, to test cu-bridge please build vectorAdd, following the steps in the usage guide in the Gitee repository.

LiuWei

May 28, 2026, 16:40

Sorry, I had not found where vectorAdd was located.
I have now gone into the vectorAdd directory and run make_maca, which succeeded and produced the vectorAdd binary. Running ./vectorAdd printed: Test PASSED Done. That suggests cu-bridge itself is working.

LiuWei

May 28, 2026, 16:54

That brings me back to the original problem: building official vLLM fails. The error screenshots are in my earlier post, linked here:
developer.metax-tech.com/forum/t/su-zhu-ji-shang-an-zhuang-wan-vllm-metaxhou-ru-he-an-zhuang-guan-fang-de-vllm/516/post/2308/

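shuai_chen

May 29, 2026, 11:38

Hello,
I. You are currently building official vLLM directly, but official vLLM's setup.py contains no MACA-related logic at all. The build has to be run in the vllm-metax repository.
The correct build order is:

First build vllm-metax (which provides the MACA-specific operators _C, _moe_C, and so on).
Then build official vLLM with VLLM_TARGET_DEVICE=empty (only the pure Python layer is used; the operators come from vllm-metax).

II. cu-bridge was not built correctly
The setup.py in vllm-metax contains:

CMAKE_EXECUTABLE = "cmake" if not USE_MACA else "cmake_maca"

In a MACA environment, cmake_maca is used instead of the system cmake. cmake_maca is a cmake wrapper provided by the MACA SDK; when compiling .cu files it calls cu-bridge's gnu_wrapper to convert the CUDA source into MACA device code.
gnu_wrapper is not an executable shipped in the cu-bridge source tree; it is generated automatically by the cmake build when cu-bridge is compiled. If you installed cu-bridge by simply extracting a prebuilt package (without running cmake + make locally), gnu_wrapper will not have been generated.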
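To see which cmake executable that setup.py line would select, and whether it is actually reachable on PATH, something like the following can be run. It is a sketch only: choose_cmake and resolve_on_path are hypothetical helper names, and the selection logic simply mirrors the single line quoted above rather than the full vllm-metax setup.py.

```python
import shutil


def choose_cmake(use_maca):
    """Mirror the quoted setup.py line: plain cmake unless the MACA toolchain is in use."""
    return "cmake" if not use_maca else "cmake_maca"


def resolve_on_path(tool):
    """Return the full path of tool if it is found on PATH, else None."""
    return shutil.which(tool)


for use_maca in (False, True):
    tool = choose_cmake(use_maca)
    print(f"USE_MACA={use_maca}: {tool} -> {resolve_on_path(tool)}")
```

If cmake_maca resolves to None here, the PATH entries for the cu-bridge tools are probably not set in the current shell.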
Solution
Step 1: Rebuild cu-bridge from source

Download the cu-bridge source (the version must match your MACA SDK; yours is 3.5.3)
cd /tmp
wget gitee.com/metax-maca/cu-bridge/repository/archive/3.5.3.zip
unzip 3.5.3.zip
mv cu-bridge-3.5.3 cu-bridge
chmod 755 cu-bridge -Rf
cd cu-bridge
mkdir build && cd build
cmake -DCMAKE_INSTALL_PREFIX=/opt/maca/tools/cu-bridge ../
make -j$(nproc) && make install

After the build completes, confirm that /opt/maca/tools/cu-bridge/bin/gnu_wrapper exists:

ls -la /opt/maca/tools/cu-bridge/bin/gnu_wrapper

Step 2: Set the MACA environment variables

export MACA_PATH="/opt/maca"
export CUCC_PATH="${MACA_PATH}/tools/cu-bridge"
export CUDA_PATH="${HOME}/cu-bridge/CUDA_DIR"
export CUCC_CMAKE_ENTRY=2

export PATH=${MACA_PATH}/mxgpu_llvm/bin:${MACA_PATH}/bin:${CUCC_PATH}/tools:${CUCC_PATH}/bin:${PATH}
export LD_LIBRARY_PATH=${MACA_PATH}/lib:${MACA_PATH}/ompi/lib:${MACA_PATH}/mxgpu_llvm/lib:${LD_LIBRARY_PATH}

export VLLM_INSTALL_PUNICA_KERNELS=1

Step 3: Build vllm-metax (the MetaX plugin)

cd /path/to/vLLM-metax
python use_existing_metax.py
pip install -r requirements/build.txt
pip install . --no-build-isolation

Here the MACA build of torch should already define MACA_HOME in torch.utils.cpp_extension, so USE_MACA=True takes effect and cmake_maca is used for the build.
Step 4: Build official vLLM (empty-device mode). Example:

git clone --depth 1 --branch releases/v0.20.0 https://github.com/vllm-project/vllm
cd vllm
python use_existing_torch.py
pip install -r requirements/build.txt
VLLM_TARGET_DEVICE=empty pip install . --no-build-isolation

Note: the key here is VLLM_TARGET_DEVICE=empty, which makes official vLLM skip compiling all GPU operators (these are provided by vllm-metax) and install only the pure Python layer.
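Once both builds finish, a quick way to confirm what ended up in the environment is to query the installed distributions. This is a sketch under assumptions: "vllm" is the upstream distribution name, while "vllm-metax" is an assumed name for the plugin distribution and may differ in the MetaX packages.

```python
from importlib import metadata


def installed_versions(dist_names):
    """Map each distribution name to its installed version, or None if it is absent."""
    versions = {}
    for name in dist_names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# "vllm-metax" is an assumption; adjust it to the name shown by `pip list`.
for name, version in installed_versions(["vllm", "vllm-metax"]).items():
    print(f"{name}: {version or 'not installed'}")
```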

LiuWei

May 29, 2026, 22:30

OK, thank you very much [folded hands]. I will start over following these steps on Monday.

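LiuWei

June 1, 2026, 15:54

Hello,
I have just carried out Step 1 (rebuilding cu-bridge from source), but afterwards ls -la /opt/maca/tools/cu-bridge/bin/gnu_wrapper shows that the file still does not exist.
Conclusion: /opt/maca/tools/cu-bridge/bin/gnu_wrapper does not exist, so building official vLLM afterwards will still hit the error above.

Detailed log of what I ran:
1. Backed up the existing /opt/maca/tools/cu-bridge/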
(vllm_clone_0.19.0) [root@localhost build]# mv /opt/maca/tools/cu-bridge/ /opt/maca/tools/cu-bridge.bak

2. Checked the make and cmake versions:
(vllm_clone_0.19.0) [root@localhost build]# make -version
GNU Make 4.3
为 x86_64-koji-linux-gnu 编译
Copyright (C) 1988-2020 Free Software Foundation, Inc.
许可证：GPLv3+：GNU 通用公共许可证第 3 版或更新版本gnu.org/licenses/gpl.html。
本软件是自由软件：您可以自由修改和重新发布它。
在法律允许的范围内没有其他保证。

(vllm_clone_0.19.0) [root@localhost bin]# cmake -version
cmake version 3.16.5
CMake suite maintained and supported by Kitware (kitware.com/cmake).

3. Downloaded cu-bridge-3.5.3.zip (12428135 bytes) from Gitee. Download address: gitee.com/metax-maca/cu-bridge/repository/archive/3.5.3.zip
(vllm_clone_0.19.0) [root@localhost opt]# ll
-rw-r--r-- 1 root root 12428135 6月 1 13:23 cu-bridge-3.5.3.zip

Extracted the archive:
(vllm_clone_0.19.0) [root@localhost opt]# unzip cu-bridge-3.5.3.zip

4. Renamed the project directory
(vllm_clone_0.19.0) [root@localhost opt]# mv cu-bridge-3.5.3 cu-bridge

5. Changed permissions
(vllm_clone_0.19.0) [root@localhost opt]# chmod 755 cu-bridge -Rf

6. Created the build directory
(vllm_clone_0.19.0) [root@localhost opt]# cd cu-bridge
(vllm_clone_0.19.0) [root@localhost cu-bridge]# mkdir build
(vllm_clone_0.19.0) [root@localhost cu-bridge]# cd build/

7. Ran cmake
(vllm_clone_0.19.0) [root@localhost build]# cmake -DCMAKE_INSTALL_PREFIX=/opt/maca/tools/cu-bridge ../
CMake Warning (dev) in CMakeLists.txt:
No project() command is present. The top-level CMakeLists.txt file must
contain a literal, direct call to the project() command. Add a line of
code such as

project(ProjectName)

near the top of the file, but after cmake_minimum_required().

CMake is pretending there is a "project(Project)" command on the first
line.
This warning is for project developers. Use -Wno-dev to suppress it.

CMake Warning (dev) in CMakeLists.txt:
cmake_minimum_required() should be called prior to this top-level project()
call. Please see the cmake-commands(7) manual for usage documentation of
both commands.
This warning is for project developers. Use -Wno-dev to suppress it.

-- The C compiler identification is GNU 7.3.0
-- The CXX compiler identification is GNU 7.3.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- DEST_DIR:./
-- MACA_PATH=/opt/maca
-- PACKAGE_GENERATOR=DEB
-- CMAKE_TARGET_ARCH=
-- Ver=[3.5.3.18] in [/opt/maca/Version.txt]
-- Ver=[3.5.3] in [/opt/cu-bridge/Version.txt]
-- CMAKE_INSTALL_PREFIX=/opt/maca/tools/cu-bridge
-- MACA_PATH=/opt/maca
-- CUPRJ_INCLUDE: /opt/cu-bridge/src/bridge/tools_ext/../../../include/
-- Configuring done (0.5s)
-- Generating done (0.0s)
-- Build files have been written to: /opt/cu-bridge/build

8. Ran make
The full make log is in the attachment cu-bridge.log, because the reply box has a 60000-character limit.

9. Inspected /opt/maca/tools/cu-bridge/
(vllm_clone_0.19.0) [root@localhost build]# ll /opt/maca/tools/cu-bridge/bin
conf_bazel.json
conf_gfortran.json
conf_gnu_bazel.json
conf_gnu.json
conf.json
conf_regex.yaml
cucc
gfortran
gnu
gomxccbin
note.md
process_args.py
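A listing like the one above can also be checked programmatically. The sketch below uses a hypothetical helper, missing_entries, and runs against a throwaway directory that mimics part of the listing; on the real host the directory argument would be /opt/maca/tools/cu-bridge/bin.

```python
import os
import tempfile


def missing_entries(directory, expected):
    """Return the expected file names that are absent from directory."""
    present = set(os.listdir(directory))
    return [name for name in expected if name not in present]


# Demo on a temporary directory holding a few of the names from the listing.
with tempfile.TemporaryDirectory() as bin_dir:
    for name in ("cucc", "gnu", "gfortran"):
        open(os.path.join(bin_dir, name), "w").close()
    demo_missing = missing_entries(bin_dir, ["cucc", "gnu_wrapper"])

print("missing:", demo_missing)
```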

[Attachment: cu-bridge.log (Text, 164.4 KB), uploaded by LiuWei on June 1, 2026.]

shuai_chen

June 1, 2026, 20:30

Hello, please ignore this error
and carry out the following steps.
Configure the environment variables

# setup MACA path
export MACA_PATH="/opt/maca"

# cu-bridge
export CUCC_PATH="${MACA_PATH}/tools/cu-bridge"
export CUDA_PATH="${HOME}/cu-bridge/CUDA_DIR"
export CUCC_CMAKE_ENTRY=2

# update PATH
export PATH=${MACA_PATH}/mxgpu_llvm/bin:${MACA_PATH}/bin:${CUCC_PATH}/tools:${CUCC_PATH}/bin:${PATH}
export LD_LIBRARY_PATH=${MACA_PATH}/lib:${MACA_PATH}/ompi/lib:${MACA_PATH}/mxgpu_llvm/lib:${LD_LIBRARY_PATH}

Build vLLM

# clone vllm repository
git clone https://github.com/vllm-project/vllm.git
cd vllm
git checkout v0.19.0

python use_existing_torch.py
pip install -r requirements/build.txt
VLLM_TARGET_DEVICE=empty pip install . --no-build-isolation

Build vLLM-metax

git clone --branch v0.19.0 https://github.com/MetaX-MACA/vLLM-metax
cd vLLM-metax
git checkout v0.19.0
python use_existing_metax.py
pip install -r requirements/build.txt
VLLM_TARGET_DEVICE=empty pip install . --no-build-isolation
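For a scripted build, the environment variables from this post can be assembled in Python and passed to the build command instead of being exported by hand. This is a minimal sketch under assumptions: maca_build_env is a hypothetical helper, the paths are the ones given in the post, and the command is only constructed here, not executed.

```python
import os


def maca_build_env(base_env, home, maca_path="/opt/maca"):
    """Return a copy of base_env with the MACA and cu-bridge variables from the post applied."""
    env = dict(base_env)
    cucc_path = f"{maca_path}/tools/cu-bridge"
    env["MACA_PATH"] = maca_path
    env["CUCC_PATH"] = cucc_path
    env["CUDA_PATH"] = f"{home}/cu-bridge/CUDA_DIR"
    env["CUCC_CMAKE_ENTRY"] = "2"
    path_prefix = [
        f"{maca_path}/mxgpu_llvm/bin",
        f"{maca_path}/bin",
        f"{cucc_path}/tools",
        f"{cucc_path}/bin",
    ]
    lib_prefix = [
        f"{maca_path}/lib",
        f"{maca_path}/ompi/lib",
        f"{maca_path}/mxgpu_llvm/lib",
    ]
    # Prepend the MACA directories, keeping whatever the caller already had.
    env["PATH"] = ":".join(path_prefix + [base_env.get("PATH", "")])
    env["LD_LIBRARY_PATH"] = ":".join(lib_prefix + [base_env.get("LD_LIBRARY_PATH", "")])
    return env


env = maca_build_env(os.environ, os.path.expanduser("~"))
env["VLLM_TARGET_DEVICE"] = "empty"
build_cmd = ["pip", "install", ".", "--no-build-isolation"]
print(env["CUDA_PATH"])
print(" ".join(build_cmd))
```

The resulting env and build_cmd could then be handed to subprocess.run(build_cmd, env=env, cwd=...) from inside the cloned repository.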

LiuWei

June 2, 2026, 09:32

Hello, my description above may not have been clear.
I carried out the Step 1 you provided:

Step 1: Rebuild cu-bridge from source

Download the cu-bridge source (the version must match your MACA SDK; yours is 3.5.3)
cd /tmp
wget gitee.com/metax-maca/cu-bridge/repository/archive/3.5.3.zip
unzip 3.5.3.zip
mv cu-bridge-3.5.3 cu-bridge
chmod 755 cu-bridge -Rf
cd cu-bridge
mkdir build && cd build
cmake -DCMAKE_INSTALL_PREFIX=/opt/maca/tools/cu-bridge ../
make -j$(nproc) && make install

After the build, /opt/maca/tools/cu-bridge/bin/gnu_wrapper still does not exist, so I expect the later vLLM build will still fail with the missing gnu_wrapper error, is that right?

shuai_chen

June 2, 2026, 09:33

Hello, are you building vllm or vllm-metax?

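LiuWei

June 2, 2026, 09:36

Hello, I am following your reply: developer.metax-tech.com/forum/t/su-zhu-ji-shang-an-zhuang-wan-vllm-metaxhou-ru-he-an-zhuang-guan-fang-de-vllm/516/post/2345/

It has four steps in total. After carrying out Step 1 (rebuilding cu-bridge from source), I found that /opt/maca/tools/cu-bridge/bin/gnu_wrapper still does not exist.
I have not yet carried out the remaining three steps (Step 2: Set the MACA environment variables; Step 3: Build vllm-metax (the MetaX plugin); Step 4: Build official vLLM (empty-device mode)).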

shuai_chen

June 2, 2026, 09:38

Hello, please ignore the error and try continuing with the remaining steps.

LiuWei

June 2, 2026, 14:06

Hello, I have completed Steps 1 and 2 from your post above:
Step 1: Rebuild cu-bridge from source (but /opt/maca/tools/cu-bridge/bin/gnu_wrapper was not generated)
Step 2: Set the MACA environment variables