• Members 14 posts
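    May 27, 2026 09:43

    1. My host operating system is Kylin V10 SP3.
    2. My GPU is a MetaX 曦云 (Xiyun) C500.
    3. I have already installed vllm-metax-0.19.0.
    4. The official vLLM wheel is vllm-0.19.0-cp38-abi3-manylinux_2_31_x86_64.whl, where 2_31 means glibc >= 2.31. My host OS, Kylin V10 SP3, ships glibc 2.28 by default, so the wheel cannot be installed directly. Forcibly upgrading glibc is risky, so I need to build and install official vLLM from source on the server.
    5. How do I build and install official vLLM in this environment? I have read github.com/MetaX-MACA/vLLM-metax and https://github.com/vllm-project/vllm, but I cannot work out the concrete build steps, because the official vLLM installation has no option for MetaX GPUs. This is urgent. Thank you.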
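
Editor's note: the glibc constraint in point 4 can be checked mechanically. The following is a minimal sketch, not an official tool; the helper names `required_glibc`, `host_glibc`, and `wheel_fits` are made up for illustration. It reads the `manylinux_X_Y` tag from a wheel filename and compares it with a glibc version.

```python
import platform
import re


def required_glibc(wheel_filename):
    """Return the (major, minor) glibc floor from a manylinux_X_Y tag, or None."""
    match = re.search(r"manylinux_(\d+)_(\d+)", wheel_filename)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def host_glibc():
    """Return the host glibc version as (major, minor), or None if unknown."""
    name, version = platform.libc_ver()
    parts = version.split(".")
    if name != "glibc" or len(parts) < 2:
        return None
    return int(parts[0]), int(parts[1])


def wheel_fits(wheel_filename, glibc_version):
    """True when glibc_version satisfies the wheel's manylinux floor."""
    needed = required_glibc(wheel_filename)
    return needed is None or glibc_version >= needed


if __name__ == "__main__":
    wheel = "vllm-0.19.0-cp38-abi3-manylinux_2_31_x86_64.whl"
    print("wheel requires glibc >=", required_glibc(wheel))
    print("host glibc:", host_glibc())
    print("fits glibc 2.28:", wheel_fits(wheel, (2, 28)))
```

Run on the host, the script prints the glibc version that Python detects, which can be compared with the 2.28 figure quoted in the post.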

  • Thread has been moved from 产品&运维 (Products & Operations).

  • Members 521 posts
    May 27, 2026 12:56

    Dear developer, please upgrade the glibc version on the host, or (recommended) deploy using the container image.

  • Members 14 posts
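    May 27, 2026 14:32

    Hello,
    1. This is a Xinchuang (信创) project. Docker is not on the approved Xinchuang software list, so I cannot use it and must deploy directly on the host.
    2. Kylin V10 SP3 ships glibc 2.28 by default. Upgrading to glibc 2.31 would have to be done in a custom path, without overwriting or replacing the system glibc 2.28, and would then require setting environment variables. I am not sure how that would affect the operating system, so this approach is too risky.
    3. I am currently building official vLLM 0.19.0 from source. According to information I got from an LLM, vLLM can be built from source on glibc 2.28. The server already has the MetaX GPU driver (metax-driver-3.5.3.11-rpm-x86_64.run), the SDK (maca-sdk-3.5.3.18-rpm-x86_64.tar.xz), cu-bridge (gitee.com/metax-maca/cu-bridge), and vllm-metax (maca-vllm-metax-0.19.0-py310-3.5.3.502-linux-x86_64.tar.xz) installed, and mx-smi output looks normal. Building official vLLM 0.19.0 (github.com/vllm-project/vllm) from source now fails. The build command is: python setup.py install, and the error says that /opt/maca/tools/cu-bridge/bin/gnu_wrapper is missing. The full error is in the screenshots below, along with the contents of /opt/maca/tools/cu-bridge/bin/.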

    [Attachment: image.png, PNG, 931.5 KB, uploaded by LiuWei on May 27, 2026]
    [Attachment: image.png, PNG, 1.6 MB, uploaded by LiuWei on May 27, 2026]
    [Attachment: image.png, PNG, 1.4 MB, uploaded by LiuWei on May 27, 2026]
    [Attachment: image.png, PNG, 1.6 MB, uploaded by LiuWei on May 27, 2026]

  • Members 521 posts
    May 27, 2026 15:22

    Dear developer, please provide the environment variables you have set for cu-bridge.

  • Members 14 posts
    May 28, 2026 08:46

    OK, the cu-bridge environment variables are shown in the image:

    [Attachment: ccdf5cae490a408154b182299751abae.jpg, JPG, 286.7 KB, uploaded by LiuWei on May 28, 2026]

  • Members 521 posts
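    May 28, 2026 10:07

    Dear developer, please first follow the example in the cu-bridge usage documentation and check whether it compiles, to confirm that cu-bridge is installed and working correctly.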

  • Members 14 posts
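    May 28, 2026 11:14

    Understood.
    1. The metax-maca/cu-bridge I installed is version 3.5.3.
    2. I will now follow the example in the cu-bridge usage documentation to see whether it compiles. Which version of github.com/NVIDIA/cuda-samples should I download? An LLM told me that cuda-samples v11.5 or v11.6 can be downloaded and built. Is that accurate?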

  • Members 14 posts
    May 28, 2026 14:49

    Hello, I built cuda-samples v11.5 on the server, but it failed. Details below:

    1. The cu-bridge environment variables are:
    export MACA_CLANG_PATH=/opt/maca/mxgpu_llvm/bin
    export CUCC_PATH=/opt/maca/tools/cu-bridge
    export MACA_PATH=/opt/maca
    export CUDA_PATH=/opt/maca/tools/cu-bridge
    export LD_LIBRARY_PATH=/opt/maca/mxgpu_llvm/lib:/opt/maca/ompi/lib/:/opt/maca/lib:${LD_LIBRARY_PATH}
    export PATH=/opt/maca/tools/cu-bridge/tools:/opt/maca/bin:/opt/maca/mxgpu_llvm/bin/:${CUDA_PATH}/bin:$PATH
    export CUCC_CMAKE_ENTRY=2

    2. Output of which mxcc:
    (vllm_clone_0.19.0) [root@localhost workspace]# which mxcc
    /opt/maca/mxgpu_llvm/bin/mxcc

    3. Output of macainfo:
    (vllm_clone_0.19.0) [root@localhost workspace]# macainfo
    ======================
    MXC System Attributes
    ======================
    Runtime Version: 1.0
    System Timestamp Freq: 1000MHz
    Signal Max Wait Time: 18446744073709551615(0xffffffffffffffff)
    Machine Model: LARGE
    System Endianess: LITTLE


    Agent 1


    Name: HYGON C86-4G (OPN:7491)
    Uuid: 4350552d-5858-0000-0000-000000000000
    Market Name: Hygon C86-4G (OPN:7491)
    Vendor Name: CPU
    Feature: Not Specified
    Profile: FULL_PROFILE
    Float Round Mode: NEAR
    Max Queue Number: 0(0x0)
    Queue Min Size: 0
    Queue Max Size: 0
    Queue Type: MULTI
    Node: 0
    Device Type: CPU
    Cache Info:
    Chip ID: 0(0x0)
    Cacheline Size: 64(0x40)
    Max Clock Freq(MHz): 0
    BDFID: 0
    Internal Node ID: 0
    Accelerator Processors: 32
    PEUs per AP: 0
    Data Processor Clustes(DPCs): 0
    DPC Arrays per DPC.: 0
    Watch Pointers on Address Ranges: 1
    Pool Info:
    Pool 1
    Segment: GLOBAL
    Flags: FINE_GRAINED
    Size: 2321540(0x236c84) KB
    Allocatable: TRUE
    Alloc Granule: 4 KB
    Alloc Alignment: 4 KB
    Accessible by all: TRUE
    Pool 2
    Segment: GLOBAL
    Flags: FINE_GRAINED
    Size: 2321540(0x236c84) KB
    Allocatable: TRUE
    Alloc Granule: 4 KB
    Alloc Alignment: 4 KB
    Accessible by all: TRUE
    ISA Info:
    N/A


    4. Running make_maca in the cuda-samples-11.5 directory fails with the following errors:

    (vllm_clone_0.19.0) [root@localhost cuda-samples-11.5]# make_maca
    make maca verion:0220 ...
    make[1]: 进入目录“/data/workspace/cuda-samples-11.5/Samples/BlackScholes”

    GCC Version is greater or equal to 4.8.0 <<<
    make[1]: 对“all”无需做任何事。
    make[1]: 离开目录“/data/workspace/cuda-samples-11.5/Samples/BlackScholes”
    make[1]: 进入目录“/data/workspace/cuda-samples-11.5/Samples/BlackScholes_nvrtc”
    GCC Version is greater or equal to 4.8.0 <<<
    make[1]: 对“all”无需做任何事。
    make[1]: 离开目录“/data/workspace/cuda-samples-11.5/Samples/BlackScholes_nvrtc”
    make[1]: 进入目录“/data/workspace/cuda-samples-11.5/Samples/EGLStream_CUDA_CrossGPU”
    WARNING - libEGL.so not found, please install libEGL.so <<<
    WARNING - egl.h not found, please install egl.h <<<
    WARNING - eglext.h not found, please install eglext.h <<<
    GCC Version is greater or equal to 4.8.0 <<<
    [@] /root/cu-bridge/CUDA_DIR//bin/nvcc -ccbin g++ -I../../Common -m64 --threads 0 --std=c++11 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_86,code=compute_86 -o cuda_consumer.o -c cuda_consumer.cpp
    [@] /root/cu-bridge/CUDA_DIR//bin/nvcc -ccbin g++ -I../../Common -m64 --threads 0 --std=c++11 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_86,code=compute_86 -o cuda_producer.o -c cuda_producer.cpp
    [@] /root/cu-bridge/CUDA_DIR//bin/nvcc -ccbin g++ -I../../Common -m64 --threads 0 --std=c++11 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_86,code=compute_86 -o eglstrm_common.o -c eglstrm_common.cpp
    [@] /root/cu-bridge/CUDA_DIR//bin/nvcc -ccbin g++ -I../../Common -m64 --threads 0 --std=c++11 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_86,code=compute_86 -o kernel.o -c kernel.cu
    [@] /root/cu-bridge/CUDA_DIR//bin/nvcc -ccbin g++ -I../../Common -m64 --threads 0 --std=c++11 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_86,code=compute_86 -o main.o -c main.cpp
    [@] /root/cu-bridge/CUDA_DIR//bin/nvcc -ccbin g++ -m64 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_86,code=compute_86 -o EGLStream_CUDA_CrossGPU cuda_consumer.o cuda_producer.o eglstrm_common.o kernel.o main.o -lEGL -L/root/cu-bridge/CUDA_DIR//lib64/stubs -lcuda
    [@] mkdir -p ../../bin/x86_64/linux/release
    [@] cp EGLStream_CUDA_CrossGPU ../../bin/x86_64/linux/release
    make[1]: 离开目录“/data/workspace/cuda-samples-11.5/Samples/EGLStream_CUDA_CrossGPU”
    make[1]: 进入目录“/data/workspace/cuda-samples-11.5/Samples/EGLStream_CUDA_Interop”
    WARNING - libEGL.so not found, please install libEGL.so <<<
    WARNING - egl.h not found, please install egl.h <<<
    WARNING - eglext.h not found, please install eglext.h <<<
    [@] /root/cu-bridge/CUDA_DIR//bin/nvcc -ccbin g++ -I../../Common -m64 --threads 0 -o cuda_consumer.o -c cuda_consumer.cpp
    [@] /root/cu-bridge/CUDA_DIR//bin/nvcc -ccbin g++ -I../../Common -m64 --threads 0 -o cuda_producer.o -c cuda_producer.cpp
    [@] /root/cu-bridge/CUDA_DIR//bin/nvcc -ccbin g++ -I../../Common -m64 --threads 0 -o eglstrm_common.o -c eglstrm_common.cpp
    [@] /root/cu-bridge/CUDA_DIR//bin/nvcc -ccbin g++ -I../../Common -m64 --threads 0 -o main.o -c main.cpp
    [@] /root/cu-bridge/CUDA_DIR//bin/nvcc -ccbin g++ -m64 -o EGLStream_CUDA_Interop cuda_consumer.o cuda_producer.o eglstrm_common.o main.o -lEGL -L/root/cu-bridge/CUDA_DIR//lib64/stubs -lcuda
    [@] mkdir -p ../../bin/x86_64/linux/release
    [@] cp EGLStream_CUDA_Interop ../../bin/x86_64/linux/release
    make[1]: 离开目录“/data/workspace/cuda-samples-11.5/Samples/EGLStream_CUDA_Interop”
    make[1]: 进入目录“/data/workspace/cuda-samples-11.5/Samples/EGLStreams_CUDA_Interop”
    WARNING - libEGL.so not found, please install libEGL.so <<<
    WARNING - egl.h not found, please install egl.h <<<
    WARNING - eglext.h not found, please install eglext.h <<<
    GCC Version is greater or equal to 4.8.0 <<<
    [@] /root/cu-bridge/CUDA_DIR//bin/nvcc -ccbin g++ -I../../Common -m64 --threads 0 --std=c++11 -o cuda_consumer.o -c cuda_consumer.cpp
    [@] /root/cu-bridge/CUDA_DIR//bin/nvcc -ccbin g++ -I../../Common -m64 --threads 0 --std=c++11 -o cuda_producer.o -c cuda_producer.cpp
    [@] /root/cu-bridge/CUDA_DIR//bin/nvcc -ccbin g++ -I../../Common -m64 --threads 0 --std=c++11 -o eglstrm_common.o -c eglstrm_common.cpp
    [@] /root/cu-bridge/CUDA_DIR//bin/nvcc -ccbin g++ -I../../Common -m64 --threads 0 --std=c++11 -o main.o -c main.cpp
    [@] /root/cu-bridge/CUDA_DIR//bin/nvcc -ccbin g++ -m64 -o EGLStream_CUDA_Interop cuda_consumer.o cuda_producer.o eglstrm_common.o main.o -lEGL -L/root/cu-bridge/CUDA_DIR//lib64/stubs -lcuda
    [@] mkdir -p ../../bin/x86_64/linux/release
    [@] cp EGLStream_CUDA_Interop ../../bin/x86_64/linux/release
    make[1]: 离开目录“/data/workspace/cuda-samples-11.5/Samples/EGLStreams_CUDA_Interop”
    make[1]: 进入目录“/data/workspace/cuda-samples-11.5/Samples/EGLSync_CUDAEvent_Interop”
    WARNING - EGLSync_CUDAEvent_Interop is not supported on Linux x86_64 - waiving sample <<<
    WARNING - libEGL.so not found, please install libEGL.so <<<
    WARNING - egl.h not found, please install egl.h <<<
    WARNING - eglext.h not found, please install eglext.h <<<
    GCC Version is greater or equal to 4.8.0 <<<
    [@] /root/cu-bridge/CUDA_DIR//bin/nvcc -ccbin g++ -I../../Common -m64 --threads 0 --std=c++11 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_86,code=compute_86 -o EGLSync_CUDAEvent_Interop.o -c EGLSync_CUDAEvent_Interop.cu
    [@] /root/cu-bridge/CUDA_DIR//bin/nvcc -ccbin g++ -m64 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_86,code=compute_86 -o EGLSync_CUDAEvent_Interop EGLSync_CUDAEvent_Interop.o -lEGL -L/root/cu-bridge/CUDA_DIR//lib64/stubs -lcuda -lX11 -lGLESv2
    [@] mkdir -p ../../bin/x86_64/linux/release
    [@] cp EGLSync_CUDAEvent_Interop ../../bin/x86_64/linux/release
    make[1]: 离开目录“/data/workspace/cuda-samples-11.5/Samples/EGLSync_CUDAEvent_Interop”
    make[1]: 进入目录“/data/workspace/cuda-samples-11.5/Samples/FDTD3d”
    GCC Version is greater or equal to 4.8.0 <<<
    make[1]: 对“all”无需做任何事。
    make[1]: 离开目录“/data/workspace/cuda-samples-11.5/Samples/FDTD3d”
    make[1]: 进入目录“/data/workspace/cuda-samples-11.5/Samples/FilterBorderControlNPP”
    GCC Version is greater or equal to 4.8.0 <<<
    test.c:1:10: fatal error: 'FreeImage.h' file not found
    1 | #include "FreeImage.h"
    | ^~~
    1 error generated.
    WARNING - FreeImage is not set up correctly. Please ensure FreeImage is set up correctly. <<<
    [@] /root/cu-bridge/CUDA_DIR//bin/nvcc -ccbin g++ -I../../Common -I../../Common/UtilNPP -m64 --threads 0 --std=c++11 -gencode arch=compute_35,code=compute_35 -o FilterBorderControlNPP.o -c FilterBorderControlNPP.cpp
    [@] /root/cu-bridge/CUDA_DIR//bin/nvcc -ccbin g++ -m64 -gencode arch=compute_35,code=compute_35 -o FilterBorderControlNPP FilterBorderControlNPP.o -lnppisu_static -lnppif_static -lnppitc_static -lnppidei_static -lnppial_static -lnppc_static -lculibos -lfreeimage
    [@] mkdir -p ../../bin/x86_64/linux/release
    [@] cp FilterBorderControlNPP ../../bin/x86_64/linux/release
    make[1]: 离开目录“/data/workspace/cuda-samples-11.5/Samples/FilterBorderControlNPP”
    make[1]: 进入目录“/data/workspace/cuda-samples-11.5/Samples/FunctionPointers”
    WARNING - libGL.so not found, refer to CUDA Getting Started Guide for how to find and install them. <<<
    WARNING - libGLU.so not found, refer to CUDA Getting Started Guide for how to find and install them. <<<
    WARNING - gl.h not found, refer to CUDA Getting Started Guide for how to find and install them. <<<
    WARNING - glu.h not found, refer to CUDA Getting Started Guide for how to find and install them. <<<
    GCC Version is greater or equal to 4.8.0 <<<
    [@] /root/cu-bridge/CUDA_DIR//bin/nvcc -ccbin g++ -I../../Common -m64 --threads 0 --std=c++11 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_86,code=compute_86 -o FunctionPointers.o -c FunctionPointers.cpp
    [@] /root/cu-bridge/CUDA_DIR//bin/nvcc -ccbin g++ -I../../Common -m64 --threads 0 --std=c++11 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_86,code=compute_86 -o FunctionPointers_kernels.o -c FunctionPointers_kernels.cu
    [@] /root/cu-bridge/CUDA_DIR//bin/nvcc -ccbin g++ -m64 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_86,code=compute_86 -o FunctionPointers FunctionPointers.o FunctionPointers_kernels.o -lGL -lGLU -lglut
    [@] mkdir -p ../../bin/x86_64/linux/release
    [@] cp FunctionPointers ../../bin/x86_64/linux/release
    make[1]: 离开目录“/data/workspace/cuda-samples-11.5/Samples/FunctionPointers”
    make[1]: 进入目录“/data/workspace/cuda-samples-11.5/Samples/HSOpticalFlow”
    GCC Version is greater or equal to 4.8.0 <<<
    make[1]: 对“all”无需做任何事。
    make[1]: 离开目录“/data/workspace/cuda-samples-11.5/Samples/HSOpticalFlow”
    make[1]: 进入目录“/data/workspace/cuda-samples-11.5/Samples/MC_EstimatePiInlineP”
    GCC Version is greater or equal to 4.8.0 <<<
    make[1]: 对“all”无需做任何事。
    make[1]: 离开目录“/data/workspace/cuda-samples-11.5/Samples/MC_EstimatePiInlineP”
    make[1]: 进入目录“/data/workspace/cuda-samples-11.5/Samples/MC_EstimatePiInlineQ”
    GCC Version is greater or equal to 4.8.0 <<<
    make[1]: 对“all”无需做任何事。
    make[1]: 离开目录“/data/workspace/cuda-samples-11.5/Samples/MC_EstimatePiInlineQ”
    make[1]: 进入目录“/data/workspace/cuda-samples-11.5/Samples/MC_EstimatePiP”
    GCC Version is greater or equal to 4.8.0 <<<
    make[1]: 对“all”无需做任何事。
    make[1]: 离开目录“/data/workspace/cuda-samples-11.5/Samples/MC_EstimatePiP”
    make[1]: 进入目录“/data/workspace/cuda-samples-11.5/Samples/MC_EstimatePiQ”
    GCC Version is greater or equal to 4.8.0 <<<
    make[1]: 对“all”无需做任何事。
    make[1]: 离开目录“/data/workspace/cuda-samples-11.5/Samples/MC_EstimatePiQ”
    make[1]: 进入目录“/data/workspace/cuda-samples-11.5/Samples/MC_SingleAsianOptionP”
    GCC Version is greater or equal to 4.8.0 <<<
    make[1]: 对“all”无需做任何事。
    make[1]: 离开目录“/data/workspace/cuda-samples-11.5/Samples/MC_SingleAsianOptionP”
    make[1]: 进入目录“/data/workspace/cuda-samples-11.5/Samples/Mandelbrot”
    WARNING - libGL.so not found, refer to CUDA Getting Started Guide for how to find and install them. <<<
    WARNING - libGLU.so not found, refer to CUDA Getting Started Guide for how to find and install them. <<<
    WARNING - gl.h not found, refer to CUDA Getting Started Guide for how to find and install them. <<<
    WARNING - glu.h not found, refer to CUDA Getting Started Guide for how to find and install them. <<<
    GCC Version is greater or equal to 4.8.0 <<<
    [@] /root/cu-bridge/CUDA_DIR//bin/nvcc -ccbin g++ -I../../Common -m64 --threads 0 --std=c++11 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_86,code=compute_86 -o Mandelbrot.o -c Mandelbrot.cpp
    [@] /root/cu-bridge/CUDA_DIR//bin/nvcc -ccbin g++ -I../../Common -m64 --threads 0 --std=c++11 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_86,code=compute_86 -o Mandelbrot_cuda.o -c Mandelbrot_cuda.cu
    [@] /root/cu-bridge/CUDA_DIR//bin/nvcc -ccbin g++ -I../../Common -m64 --threads 0 --std=c++11 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_86,code=compute_86 -o Mandelbrot_gold.o -c Mandelbrot_gold.cpp
    [@] /root/cu-bridge/CUDA_DIR//bin/nvcc -ccbin g++ -m64 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_86,code=compute_86 -o Mandelbrot Mandelbrot.o Mandelbrot_cuda.o Mandelbrot_gold.o -lGL -lGLU -lglut
    [@] mkdir -p ../../bin/x86_64/linux/release
    [@] cp Mandelbrot ../../bin/x86_64/linux/release
    make[1]: 离开目录“/data/workspace/cuda-samples-11.5/Samples/Mandelbrot”
    make[1]: 进入目录“/data/workspace/cuda-samples-11.5/Samples/MersenneTwisterGP11213”
    GCC Version is greater or equal to 4.8.0 <<<
    /root/cu-bridge/CUDA_DIR//bin/nvcc -ccbin g++ -I../../Common -m64 --threads 0 --std=c++11 -gencode arch=compute_35,code=compute_35 -o MersenneTwister.o -c MersenneTwister.cpp
    mxcc: warning: argument unused during compilation: '-Xdevice -D__CUDA_ARCH__=700' [-Wunused-command-line-argument]
    mxcc: warning: argument unused during compilation: '--offload-arch=xcore1000' [-Wunused-command-line-argument]
    In file included from MersenneTwister.cpp:43:
    ../../Common/helper_cuda.h:240:11: warning: enumeration value 'MCRAND_STATUS_NOT_IMPLEMENTED' not handled in switch [-Wswitch]
    240 | switch (error) {
    | ^
    ~~
    MersenneTwister.cpp:62:15: error: use of undeclared identifier 'findCudaDevice'
    62 | int devID = findCudaDevice(argc, (const char **)argv);
    | ^
    MersenneTwister.cpp:83:3: error: use of undeclared identifier 'checkCudaErrors'
    83 | checkCudaErrors(cudaStreamCreateWithFlags(&stream, cudaStreamNonBlocking));
    | ^
    MersenneTwister.cpp:86:3: error: use of undeclared identifier 'checkCudaErrors'
    86 | checkCudaErrors(cudaMalloc((void **)&d_Rand, rand_n * sizeof(float)));
    | ^
    MersenneTwister.cpp:89:3: error: use of undeclared identifier 'checkCudaErrors'
    89 | checkCudaErrors(curandCreateGenerator(&prngGPU, CURAND_RNG_PSEUDO_MTGP32));
    | ^
    MersenneTwister.cpp:90:3: error: use of undeclared identifier 'checkCudaErrors'
    90 | checkCudaErrors(curandSetStream(prngGPU, stream));
    | ^
    MersenneTwister.cpp:91:3: error: use of undeclared identifier 'checkCudaErrors'
    91 | checkCudaErrors(curandSetPseudoRandomGeneratorSeed(prngGPU, seed));
    | ^
    MersenneTwister.cpp:94:3: error: use of undeclared identifier 'checkCudaErrors'
    94 | checkCudaErrors(
    | ^
    MersenneTwister.cpp:96:3: error: use of undeclared identifier 'checkCudaErrors'
    96 | checkCudaErrors(curandSetPseudoRandomGeneratorSeed(prngCPU, seed));
    | ^
    MersenneTwister.cpp:101:3: error: use of undeclared identifier 'checkCudaErrors'
    101 | checkCudaErrors(cudaMallocHost(&h_RandGPU, rand_n * sizeof(float)));
    | ^
    MersenneTwister.cpp:104:3: error: use of undeclared identifier 'checkCudaErrors'
    104 | checkCudaErrors(curandGenerateUniform(prngGPU, (float *)d_Rand, rand_n));
    | ^
    MersenneTwister.cpp:107:3: error: use of undeclared identifier 'checkCudaErrors'
    107 | checkCudaErrors(cudaMemcpyAsync(h_RandGPU, d_Rand, rand_n * sizeof(float),
    | ^
    MersenneTwister.cpp:113:3: error: use of undeclared identifier 'checkCudaErrors'
    113 | checkCudaErrors(curandGenerateUniform(prngCPU, (float *)h_RandCPU, rand_n));
    | ^
    MersenneTwister.cpp:115:3: error: use of undeclared identifier 'checkCudaErrors'
    115 | checkCudaErrors(cudaStreamSynchronize(stream));
    | ^
    MersenneTwister.cpp:130:5: error: use of undeclared identifier 'checkCudaErrors'
    130 | checkCudaErrors(curandGenerateUniform(prngGPU, (float *)d_Rand, rand_n));
    | ^
    MersenneTwister.cpp:133:3: error: use of undeclared identifier 'checkCudaErrors'
    133 | checkCudaErrors(cudaStreamSynchronize(stream));
    | ^
    MersenneTwister.cpp:145:3: error: use of undeclared identifier 'checkCudaErrors'
    145 | checkCudaErrors(curandDestroyGenerator(prngGPU));
    | ^
    MersenneTwister.cpp:146:3: error: use of undeclared identifier 'checkCudaErrors'
    146 | checkCudaErrors(curandDestroyGenerator(prngCPU));
    | ^
    MersenneTwister.cpp:147:3: error: use of undeclared identifier 'checkCudaErrors'
    147 | checkCudaErrors(cudaStreamDestroy(stream));
    | ^
    MersenneTwister.cpp:148:3: error: use of undeclared identifier 'checkCudaErrors'
    148 | checkCudaErrors(cudaFree(d_Rand));
    | ^
    fatal error: too many errors emitted, stopping now [-ferror-limit=]
    1 warning and 20 errors generated.
    make[1]: *** [Makefile:346:MersenneTwister.o] 错误 1
    make[1]: 离开目录“/data/workspace/cuda-samples-11.5/Samples/MersenneTwisterGP11213”
    make: *** [Makefile:45:Samples/MersenneTwisterGP11213/Makefile.ph_build] 错误 2
    (vllm_clone_0.19.0) [root@localhost cuda-samples-11.5]#

  • Members 521 posts
    May 28, 2026 15:40

    Dear developer, please describe the complete installation steps.

  • Members 14 posts
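    May 28, 2026 16:21

    The server was previously running a Qwen3 model on an older version of vLLM.
    Steps taken for this upgrade:
    1. Uninstalled the existing driver and SDK, then installed the new versions: metax-driver-3.5.3.11-rpm-x86_64.run and maca-sdk-3.5.3.18-rpm-x86_64.tar.xz
    2. On an internet-connected server, created a new conda virtual environment with python=3.10 and installed Transformers, maca-pytorch2.6-py310-3.5.3.9-x86_64.tar.xz, and maca-vllm-metax-0.19.0-py310-3.5.3.502-linux-x86_64.tar.xz. Packed the environment with conda pack, copied it to the intranet server, and unpacked it with conda-unpack
    3. Downloaded the cu-bridge 3.5.3 package:
    gitee.com/metax-maca/cu-bridge
    Build steps:
    unzip 3.1.0.zip
    mv cu-bridge-3.1.0 cu-bridge
    sudo chmod 755 cu-bridge -Rf
    cd cu-bridge
    mkdir build && cd ./build
    cmake -DCMAKE_INSTALL_PREFIX=/opt/maca/tools/cu-bridge ../
    make && make install

    4. Downloaded cuda-samples version 11.5:
    github.com/NVIDIA/cuda-samples

    4.1 Check that the MXMACA compiler is installed correctly:
    which mxcc

    4.2 Check that the MXMACA runtime environment is working:
    macainfo

    4.3 Check the cu-bridge environment variables

    The output of 4.1 through 4.3 is in the log in my previous reply.

    5. After extracting cuda-samples-11.5:
    cd cuda-samples-11.5
    Run the build:
    make_maca

    [Attachment: image.png, PNG, 148.4 KB, uploaded by LiuWei on May 28, 2026]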

  • Members 521 posts
    May 28, 2026 16:31

    Dear developer, to test cu-bridge please build vectorAdd, following the steps in the usage guide in the Gitee repository.

  • Members 14 posts
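    May 28, 2026 16:40

    Sorry, I had not found where vectorAdd was earlier.
    I just went into the vectorAdd directory and ran make_maca, which succeeded and produced the vectorAdd binary. Running ./vectorAdd then printed: Test PASSED Done. This should mean that cu-bridge is now working correctly.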

  • Members 14 posts
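    May 28, 2026 16:54
  • Members 521 posts
    May 29, 2026 11:38

    Dear developer,
    I. You are currently trying to build official vLLM directly, but the official vLLM setup.py contains no MACA-related logic. You need to run the build in the vllm-metax repository.
    The correct build order is:

    First build vllm-metax (it provides the MACA-specific operators _C, _moe_C, and so on)
    Then build official vLLM, but with VLLM_TARGET_DEVICE=empty (only the pure Python layer is installed; the operators come from vllm-metax)

    II. cu-bridge was not built correctly
    From the vllm-metax setup.py you can see:

    CMAKE_EXECUTABLE = "cmake" if not USE_MACA else "cmake_maca"

    In a MACA environment, cmake_maca is used instead of the system cmake. cmake_maca is a cmake wrapper provided by the MACA SDK; when compiling .cu files, it calls cu-bridge's gnu_wrapper to convert CUDA source into MACA device code.
    gnu_wrapper is not an executable shipped in the cu-bridge source tree; it is generated automatically by the cmake build when cu-bridge is compiled. If you installed cu-bridge by unpacking a prebuilt package (without running cmake + make locally), gnu_wrapper will not have been generated.
    Solution
    Step 1: Rebuild cu-bridge from source

    Download the cu-bridge source (the version must match your MACA SDK; yours is 3.5.3)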
    cd /tmp
    wget gitee.com/metax-maca/cu-bridge/repository/archive/3.5.3.zip
    unzip 3.5.3.zip
    mv cu-bridge-3.5.3 cu-bridge
    chmod 755 cu-bridge -Rf
    cd cu-bridge
    mkdir build && cd build
    cmake -DCMAKE_INSTALL_PREFIX=/opt/maca/tools/cu-bridge ../
    make -j$(nproc) && make install

    After the build finishes, confirm that /opt/maca/tools/cu-bridge/bin/gnu_wrapper exists:

    ls -la /opt/maca/tools/cu-bridge/bin/gnu_wrapper

    Step 2: Set the MACA environment variables

    export MACA_PATH="/opt/maca"
    export CUCC_PATH="${MACA_PATH}/tools/cu-bridge"
    export CUDA_PATH="${HOME}/cu-bridge/CUDA_DIR"
    export CUCC_CMAKE_ENTRY=2

    export PATH=${MACA_PATH}/mxgpu_llvm/bin:${MACA_PATH}/bin:${CUCC_PATH}/tools:${CUCC_PATH}/bin:${PATH}
    export LD_LIBRARY_PATH=${MACA_PATH}/lib:${MACA_PATH}/ompi/lib:${MACA_PATH}/mxgpu_llvm/lib:${LD_LIBRARY_PATH}

    export VLLM_INSTALL_PUNICA_KERNELS=1

    Step 3: Build vllm-metax (the MetaX plugin)

    cd /path/to/vLLM-metax
    python use_existing_metax.py
    pip install -r requirements/build.txt
    pip install . --no-build-isolation

    Here, torch.utils.cpp_extension in the MACA build of torch should already define MACA_HOME, so USE_MACA=True takes effect and the build uses cmake_maca.
    Step 4: Build official vLLM (empty-device mode). Example:

    git clone --depth 1 --branch releases/v0.20.0 https://github.com/vllm-project/vllm
    cd vllm
    python use_existing_torch.py
    pip install -r requirements/build.txt
    VLLM_TARGET_DEVICE=empty pip install . --no-build-isolation

    Note: the key here is VLLM_TARGET_DEVICE=empty. It makes official vLLM skip compiling all GPU operators (these are provided by vllm-metax) and install only the pure Python layer.
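
Editor's note: before starting the builds in Steps 3 and 4, the two tools named in the reply above can be checked from Python. This is a minimal sketch under assumptions: it assumes `cmake_maca` is found through `PATH` and that `gnu_wrapper` is expected under `<cu-bridge prefix>/bin`, as the reply states. The function name `missing_build_prerequisites` is hypothetical. Whether a missing `gnu_wrapper` actually blocks the build is not settled in this thread.

```python
import os
import shutil


def missing_build_prerequisites(cucc_path, path=None):
    """List the tools named in the thread that cannot be found."""
    missing = []
    # cmake_maca is looked up on PATH, the way a build script would invoke it.
    if shutil.which("cmake_maca", path=path) is None:
        missing.append("cmake_maca (not on PATH)")
    # gnu_wrapper is expected under <cu-bridge prefix>/bin per the reply above.
    wrapper = os.path.join(cucc_path, "bin", "gnu_wrapper")
    if not os.path.isfile(wrapper):
        missing.append(wrapper)
    return missing


if __name__ == "__main__":
    # CUCC_PATH is the cu-bridge prefix used in the thread's environment setup.
    cucc = os.environ.get("CUCC_PATH", "/opt/maca/tools/cu-bridge")
    problems = missing_build_prerequisites(cucc)
    if problems:
        print("Missing:")
        for item in problems:
            print("  " + item)
    else:
        print("cmake_maca and gnu_wrapper were both found.")
```

An empty result only means the files exist; it does not prove that the toolchain compiles CUDA sources correctly.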

  • Members 14 posts
    May 29, 2026 22:30

    OK, thank you very much. I will redo everything following these steps starting Monday.

  • Members 14 posts
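    June 1, 2026 15:54

    Hello,
    I just ran Step 1 (rebuild cu-bridge from source), but afterwards ls -la /opt/maca/tools/cu-bridge/bin/gnu_wrapper shows that the file still does not exist.
    Conclusion: /opt/maca/tools/cu-bridge/bin/gnu_wrapper does not exist, so building official vLLM will still fail with the error above.

    Detailed log of what I ran:
    1. Back up the existing /opt/maca/tools/cu-bridge/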
    (vllm_clone_0.19.0) [root@localhost build]# mv /opt/maca/tools/cu-bridge/ /opt/maca/tools/cu-bridge.bak

    2. Check the make and cmake versions:
    (vllm_clone_0.19.0) [root@localhost build]# make -version
    GNU Make 4.3
    为 x86_64-koji-linux-gnu 编译
    Copyright (C) 1988-2020 Free Software Foundation, Inc.
    许可证:GPLv3+:GNU 通用公共许可证第 3 版或更新版本gnu.org/licenses/gpl.html
    本软件是自由软件:您可以自由修改和重新发布它。
    在法律允许的范围内没有其他保证。

    (vllm_clone_0.19.0) [root@localhost bin]# cmake -version
    cmake version 3.16.5
    CMake suite maintained and supported by Kitware (kitware.com/cmake).

    3. Download cu-bridge-3.5.3.zip from Gitee (size: 12428135 bytes; URL: gitee.com/metax-maca/cu-bridge/repository/archive/3.5.3.zip)
    (vllm_clone_0.19.0) [root@localhost opt]# ll
    -rw-r--r-- 1 root root 12428135 6月 1 13:23 cu-bridge-3.5.3.zip

    Unzip:
    (vllm_clone_0.19.0) [root@localhost opt]# unzip cu-bridge-3.5.3.zip

    4. Rename the project directory
    (vllm_clone_0.19.0) [root@localhost opt]# mv cu-bridge-3.5.3 cu-bridge

    5. Set permissions
    (vllm_clone_0.19.0) [root@localhost opt]# chmod 755 cu-bridge -Rf

    6. Create the build directory
    (vllm_clone_0.19.0) [root@localhost opt]# cd cu-bridge
    (vllm_clone_0.19.0) [root@localhost cu-bridge]# mkdir build
    (vllm_clone_0.19.0) [root@localhost cu-bridge]# cd build/

    7. Run cmake
    (vllm_clone_0.19.0) [root@localhost build]# cmake -DCMAKE_INSTALL_PREFIX=/opt/maca/tools/cu-bridge ../
    CMake Warning (dev) in CMakeLists.txt:
    No project() command is present. The top-level CMakeLists.txt file must
    contain a literal, direct call to the project() command. Add a line of
    code such as

    project(ProjectName)
    

    near the top of the file, but after cmake_minimum_required().

    CMake is pretending there is a "project(Project)" command on the first
    line.
    This warning is for project developers. Use -Wno-dev to suppress it.

    CMake Warning (dev) in CMakeLists.txt:
    cmake_minimum_required() should be called prior to this top-level project()
    call. Please see the cmake-commands(7) manual for usage documentation of
    both commands.
    This warning is for project developers. Use -Wno-dev to suppress it.

    -- The C compiler identification is GNU 7.3.0
    -- The CXX compiler identification is GNU 7.3.0
    -- Detecting C compiler ABI info
    -- Detecting C compiler ABI info - done
    -- Check for working C compiler: /usr/bin/cc - skipped
    -- Detecting C compile features
    -- Detecting C compile features - done
    -- Detecting CXX compiler ABI info
    -- Detecting CXX compiler ABI info - done
    -- Check for working CXX compiler: /usr/bin/c++ - skipped
    -- Detecting CXX compile features
    -- Detecting CXX compile features - done
    -- DEST_DIR:./
    -- MACA_PATH=/opt/maca
    -- PACKAGE_GENERATOR=DEB
    -- CMAKE_TARGET_ARCH=
    -- Ver=[3.5.3.18] in [/opt/maca/Version.txt]
    -- Ver=[3.5.3] in [/opt/cu-bridge/Version.txt]
    -- CMAKE_INSTALL_PREFIX=/opt/maca/tools/cu-bridge
    -- MACA_PATH=/opt/maca
    -- CUPRJ_INCLUDE: /opt/cu-bridge/src/bridge/tools_ext/../../../include/
    -- Configuring done (0.5s)
    -- Generating done (0.0s)
    -- Build files have been written to: /opt/cu-bridge/build

    8. Run make
    The full make log is in the attachment cu-bridge.log, because the reply box has a 60000-character limit.

    9. List the contents of /opt/maca/tools/cu-bridge/bin
    (vllm_clone_0.19.0) [root@localhost build]# ll /opt/maca/tools/cu-bridge/bin
    conf_bazel.json
    conf_gfortran.json
    conf_gnu_bazel.json
    conf_gnu.json
    conf.json
    conf_regex.yaml
    cucc
    gfortran
    gnu
    gomxccbin
    note.md
    process_args.py

    [Attachment: cu-bridge.log, Text, 164.4 KB, uploaded by LiuWei on June 1, 2026]

  • Members 521 posts
    June 1, 2026 20:30

    Dear developer, please ignore this error and carry out the following steps.
    Configure the environment variables:

    # setup MACA path
    export MACA_PATH="/opt/maca"
    
    # cu-bridge
    export CUCC_PATH="${MACA_PATH}/tools/cu-bridge"
    export CUDA_PATH="${HOME}/cu-bridge/CUDA_DIR"
    export CUCC_CMAKE_ENTRY=2
    
    # update PATH
    export PATH=${MACA_PATH}/mxgpu_llvm/bin:${MACA_PATH}/bin:${CUCC_PATH}/tools:${CUCC_PATH}/bin:${PATH}
    export LD_LIBRARY_PATH=${MACA_PATH}/lib:${MACA_PATH}/ompi/lib:${MACA_PATH}/mxgpu_llvm/lib:${LD_LIBRARY_PATH}
    

    Build vLLM:

    # clone vllm repository
    git clone https://github.com/vllm-project/vllm.git
    cd vllm
    git checkout v0.19.0
    
    python use_existing_torch.py
    pip install -r requirements/build.txt
    VLLM_TARGET_DEVICE=empty pip install . --no-build-isolation
    

    Build vLLM-metax:

    git clone --branch v0.19.0 https://github.com/MetaX-MACA/vLLM-metax
    cd vLLM-metax
    git checkout v0.19.0
    python use_existing_metax.py
    pip install -r requirements/build.txt
    VLLM_TARGET_DEVICE=empty pip install . --no-build-isolation
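
Editor's note: once both builds have run, one way to confirm that the packages landed in the active environment is to query the installed metadata. This is a minimal sketch; the distribution names `vllm` and `vllm-metax` are assumptions based on the package names used in this thread, and `installed_versions` is a hypothetical helper.

```python
from importlib import metadata


def installed_versions(distributions):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in distributions:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Assumed distribution names; adjust them if `pip list` shows different ones.
    for name, version in installed_versions(["vllm", "vllm-metax"]).items():
        print(f"{name}: {version or 'not installed'}")
```

This only reports what pip recorded; it does not load the compiled operators.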
    
  • Members 14 posts
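    June 2, 2026 09:32

    Hello, my description above may not have been clear.
    I ran the Step 1 you provided:

    Step 1: Rebuild cu-bridge from source
    
    Download the cu-bridge source (the version must match your MACA SDK; yours is 3.5.3)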
    cd /tmp
    wget gitee.com/metax-maca/cu-bridge/repository/archive/3.5.3.zip
    unzip 3.5.3.zip
    mv cu-bridge-3.5.3 cu-bridge
    chmod 755 cu-bridge -Rf
    cd cu-bridge
    mkdir build && cd build
    cmake -DCMAKE_INSTALL_PREFIX=/opt/maca/tools/cu-bridge ../
    make -j$(nproc) && make install
    
    

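    After the build finished, /opt/maca/tools/cu-bridge/bin/gnu_wrapper still does not exist, so I expect the later vLLM build will still fail with the missing gnu_wrapper error. Is that right?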

  • Members 521 posts
    June 2, 2026 09:33

    Dear developer, are you building vllm or vllm-metax?

  • Members 14 posts
    June 2, 2026 09:36
  • Members 521 posts
    June 2, 2026 09:38

    Dear developer, please ignore the error and try continuing with the subsequent steps.

  • Members 14 posts
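    June 2, 2026 14:06

    Hello, I have completed Steps 1 and 2 from your post above:
    Step 1: Rebuild cu-bridge from source (but /opt/maca/tools/cu-bridge/bin/gnu_wrapper was not generated)
    Step 2: Set the MACA environment variables

    Problem: in Step 3 (build vllm-metax, the MetaX plugin), after running the command:
    VLLM_TARGET_DEVICE=empty pip install . --no-build-isolation
    the build fails with the error shown in the screenshots:

    [Attachment: 117ff143dfe6934c82f475fbff43c9d7.jpg, JPG, 800.0 KB, uploaded by LiuWei on June 2, 2026]
    [Attachment: c02f633e16659354088f9f93575dbaf7.jpg, JPG, 1007.2 KB, uploaded by LiuWei on June 2, 2026]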

  • Members 521 posts
    June 2, 2026 14:13

    Dear developer, please check your pip index (package source) settings.
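
Editor's note: the reply does not say which pip setting is suspected. As a starting point, `python -m pip config list` prints the effective configuration. The sketch below is an illustration only: it reads the index-related options from the `[global]` section of pip.conf-style text. The helper name `index_settings` and the mirror URL in `SAMPLE` are made up.

```python
import configparser

# Example pip.conf content; the mirror host is a placeholder, not a real index.
SAMPLE = """
[global]
index-url = https://mirror.example.invalid/simple
trusted-host = mirror.example.invalid
"""


def index_settings(pip_conf_text):
    """Return the index-related options found in the [global] section."""
    parser = configparser.RawConfigParser()
    parser.read_string(pip_conf_text)
    if not parser.has_section("global"):
        return {}
    wanted = ("index-url", "extra-index-url", "trusted-host")
    return {
        key: parser.get("global", key)
        for key in wanted
        if parser.has_option("global", key)
    }


if __name__ == "__main__":
    for key, value in index_settings(SAMPLE).items():
        print(f"{key} = {value}")
```

On a server without internet access, an index URL that the host cannot reach is one plausible cause of a failed `pip install`, but the screenshots would be needed to confirm that.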