• Members 9 posts
    February 27, 2026 09:03

    I. Hardware and software information
    1. Server vendor: Inspur
    2. MetaX GPU model: C500
    3. OS kernel version: 6.6.71
    4. CPU virtualization enabled: Yes
    5. mx-smi output:
    mx-smi version: 2.2.4

    =================== MetaX System Management Interface Log ===================
    Timestamp : Fri Feb 27 08:59:20 2026

    Attached GPUs : 4
    +---------------------------------------------------------------------------------+
    | MX-SMI 2.2.4 Kernel Mode Driver Version: 2.12.0 |
    | MACA Version: 2.33.0.12 BIOS Version: 1.18.2.0* |
    |------------------------------------+---------------------+----------------------+
    | GPU NAME | Bus-id | GPU-Util |
    | Temp Pwr:Usage/Cap | Memory-Usage | GPU-State |
    |====================================+=====================+======================|
    | 0 MetaX C500 | 0000:85:00.0 | 0% |
    | 35C 30W / 350W | 858/65536 MiB | Available |
    +------------------------------------+---------------------+----------------------+
    | 1 MetaX C500 | 0000:b1:00.0 | 0% |
    | 35C 31W / NA | 858/65536 MiB | Available |
    +------------------------------------+---------------------+----------------------+
    | 2 MetaX C500 | 0000:c7:00.0 | 0% |
    | 36C 30W / 350W | 858/65536 MiB | Available |
    +------------------------------------+---------------------+----------------------+
    | 3 MetaX C500 | 0000:dd:00.0 | 0% |
    | 33C 28W / NA | 858/65536 MiB | Available |
    +------------------------------------+---------------------+----------------------+

    +---------------------------------------------------------------------------------+
    | Process: |
    | GPU PID Process Name GPU Memory |
    | Usage(MiB) |
    |=================================================================================|
    | no process found |
    +---------------------------------------------------------------------------------+

    End of Log
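    II. Problem description
    In a P2P scenario, a kernel on GPU 0 writes through a pointer into HBM memory on GPU 1, and GPU 1 polls that address to determine whether the data has arrived. Testing shows two problems: (1) the data written by GPU 0 stays in GPU 0's L2 cache and is not pushed to GPU 1's HBM; (2) the stale value that GPU 1 polls stays in GPU 1's L2 cache, so GPU 1 never observes the update to its own HBM. Is there a way to invalidate the cache?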

  • Members 294 posts
    February 27, 2026 11:00

    Dear developer, your bare-metal driver and firmware versions are rather old. Please upgrade to the latest version.

  • Thread has been moved from 公共 (Public).

  • Members 9 posts
    March 6, 2026 13:42

    Hello, the problem persists after switching to the new driver version.
    mx-smi version: 2.1.10

    =================== MetaX System Management Interface Log ===================
    Timestamp : Fri Mar 6 13:40:47 2026

    Attached GPUs : 8
    +---------------------------------------------------------------------------------+
    | MX-SMI 2.1.10 Kernel Mode Driver Version: 3.3.12 |
    | MACA Version: 3.2.1.10 BIOS Version: 1.20.3.0 |
    |------------------------------------+---------------------+----------------------+
    | GPU NAME | Bus-id | GPU-Util |
    | Temp Pwr:Usage/Cap | Memory-Usage | |
    |====================================+=====================+======================|
    | 0 MetaX C550 | 0000:0f:00.0 | 0% |
    | 35C 96W / 450W | 858/65536 MiB | |
    +------------------------------------+---------------------+----------------------+
    | 1 MetaX C550 | 0000:34:00.0 | 0% |
    | 38C 95W / 450W | 858/65536 MiB | |
    +------------------------------------+---------------------+----------------------+
    | 2 MetaX C550 | 0000:48:00.0 | 0% |
    | 38C 96W / 450W | 858/65536 MiB | |
    +------------------------------------+---------------------+----------------------+
    | 3 MetaX C550 | 0000:5a:00.0 | 0% |
    | 37C 97W / 450W | 858/65536 MiB | |
    +------------------------------------+---------------------+----------------------+
    | 4 MetaX C550 | 0000:87:00.0 | 0% |
    | 35C 93W / 450W | 858/65536 MiB | |
    +------------------------------------+---------------------+----------------------+
    | 5 MetaX C550 | 0000:ae:00.0 | 0% |
    | 39C 96W / 450W | 858/65536 MiB | |
    +------------------------------------+---------------------+----------------------+
    | 6 MetaX C550 | 0000:c2:00.0 | 0% |
    | 39C 95W / 450W | 858/65536 MiB | |
    +------------------------------------+---------------------+----------------------+
    | 7 MetaX C550 | 0000:d7:00.0 | 0% |
    | 38C 99W / 450W | 858/65536 MiB | |
    +------------------------------------+---------------------+----------------------+

    +---------------------------------------------------------------------------------+
    | Process: |
    | GPU PID Process Name GPU Memory |
    | Usage(MiB) |
    |=================================================================================|
    | no process found |
    +---------------------------------------------------------------------------------+
    Pseudocode for the kernels running on the two GPUs:
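    // store_with_flush<T> and load_uncached<T> are the poster's own helpers;
    // their definitions were not included in the post.

    // Launched on GPU 0; ptr points to memory on the peer GPU (GPU 1).
    __global__ void setV(float *ptr)
    {
        // Assumed global thread index; the original post used idx without defining it.
        int idx = blockIdx.x * blockDim.x + threadIdx.x;
        float val = 3.3f;
        int r = 10;

        float *peer_ptr = ptr;

        if (idx == 0) {
            store_with_flush<float>(peer_ptr, val);
        }
        asm volatile("wb_l2\n");      // MetaX instruction as posted; presumably an L2 write-back
        asm volatile("arrive 0\n");   // MetaX instruction as posted
        __threadfence_system();
        if (idx == 0)
            printf("in GPU setV %.3f %.3f\n", val, load_uncached<float>(peer_ptr));
        // Keep the kernel alive for about 10 s while the peer polls.
        while (r-- > 0 && idx == 0) {
            __nanosleep(1000000000);
        }
    }

    // Launched on GPU 1; ptr points to the same address in GPU 1's own memory.
    __global__ void printfV(float *ptr)
    {
        int idx = blockIdx.x * blockDim.x + threadIdx.x;  // assumed, as above
        int r = 10;
        // Poll once per second, ten times, from thread 0 only.
        while (idx == 0 && r-- > 0) {
            asm volatile("wb_l2\n");
            asm volatile("arrive 0\n");
            __threadfence_system();
            printf("in GPU printf %.3f\n", load_uncached<float>(ptr));
            __nanosleep(1000000000);
        }
        printf("current threadIdx %d\n", idx);
    }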
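
For comparison, here is a minimal sketch of how the same handshake is usually written in plain CUDA: the writer stores the payload, issues `__threadfence_system()`, and only then sets a separate ready flag, while the reader polls the flag through a `volatile` pointer. Everything here is an assumption for illustration: the `Mailbox` struct, the kernel names, the spin bound, and the use of `cuda*` runtime calls rather than MACA's own API. It has not been validated on MetaX hardware, and whether MACA gives `volatile` accesses and the fence the same cross-device meaning is exactly the open question in this thread.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

#define CHECK(call)                                                            \
    do {                                                                       \
        cudaError_t err_ = (call);                                             \
        if (err_ != cudaSuccess) {                                             \
            fprintf(stderr, "%s failed: %s\n", #call, cudaGetErrorString(err_)); \
            exit(1);                                                           \
        }                                                                      \
    } while (0)

// Hypothetical mailbox layout: one payload word plus a ready flag.
struct Mailbox {
    float data;
    int   flag;
};

// Runs on GPU 0. `box` points to memory that physically lives on GPU 1.
__global__ void producer(Mailbox *box, float val)
{
    if (blockIdx.x == 0 && threadIdx.x == 0) {
        volatile Mailbox *vbox = box;  // volatile: every access is a real memory access
        vbox->data = val;              // 1. write the payload
        __threadfence_system();        // 2. order the payload before the flag, system-wide
        vbox->flag = 1;                // 3. publish
    }
}

// Runs on GPU 1. The spin is bounded so the demo cannot hang forever.
__global__ void consumer(Mailbox *box, float *out, int *seen, long long max_spins)
{
    if (blockIdx.x == 0 && threadIdx.x == 0) {
        volatile Mailbox *vbox = box;  // volatile: the load is re-issued on every iteration
        long long spins = 0;
        while (vbox->flag == 0 && spins < max_spins) {
            ++spins;
        }
        __threadfence_system();        // order the flag read before the payload read
        *seen = vbox->flag;
        *out  = vbox->data;
    }
}

int main()
{
    int count = 0, can_access = 0;
    CHECK(cudaGetDeviceCount(&count));
    if (count >= 2) CHECK(cudaDeviceCanAccessPeer(&can_access, 0, 1));
    if (!can_access) {
        printf("need two GPUs with peer access; skipping demo\n");
        return 0;
    }

    // Allocate the mailbox and the result slots on GPU 1 and clear the mailbox.
    Mailbox *box; float *out; int *seen;
    CHECK(cudaSetDevice(1));
    CHECK(cudaMalloc(&box, sizeof(Mailbox)));
    CHECK(cudaMalloc(&out, sizeof(float)));
    CHECK(cudaMalloc(&seen, sizeof(int)));
    CHECK(cudaMemset(box, 0, sizeof(Mailbox)));
    CHECK(cudaDeviceSynchronize());

    // Let GPU 0 dereference pointers into GPU 1 memory.
    CHECK(cudaSetDevice(0));
    CHECK(cudaDeviceEnablePeerAccess(1, 0));

    // Start the poller first, then the writer; both launches are asynchronous.
    CHECK(cudaSetDevice(1));
    consumer<<<1, 1>>>(box, out, seen, 1LL << 26);
    CHECK(cudaGetLastError());
    CHECK(cudaSetDevice(0));
    producer<<<1, 1>>>(box, 3.3f);
    CHECK(cudaGetLastError());
    CHECK(cudaDeviceSynchronize());
    CHECK(cudaSetDevice(1));
    CHECK(cudaDeviceSynchronize());

    float h_out = 0.0f; int h_seen = 0;
    CHECK(cudaMemcpy(&h_out, out, sizeof(float), cudaMemcpyDeviceToHost));
    CHECK(cudaMemcpy(&h_seen, seen, sizeof(int), cudaMemcpyDeviceToHost));
    printf("flag seen: %d, payload: %.3f\n", h_seen, h_out);

    CHECK(cudaFree(box)); CHECK(cudaFree(out)); CHECK(cudaFree(seen));
    return h_seen ? 0 : 1;  // non-zero exit if the flag never became visible
}
```

Using a separate flag word instead of polling the payload itself means the reader never has to guess whether a value is "new". If the flag never becomes visible within the spin bound, the program exits with a non-zero status, which makes it usable as a quick visibility check.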

  • Members 9 posts
    March 6, 2026 13:46

    A further question: what exactly is the intended use case for __threadfence_system()? It does not seem to take effect in this scenario.
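
As background on that question: in CUDA, `__threadfence_system()` is an ordering fence, not a cache flush. It guarantees that the calling thread's writes before the fence are observed as happening before its writes after the fence, by other threads on the device, by host threads, and by peer devices. It does not by itself refresh anything the reader has cached, so the reader still needs `volatile` or atomic loads. The sketch below shows the textbook case, a kernel publishing a value to a polling host thread through mapped pinned memory. The kernel name `publish` and the variable names are illustrative assumptions, the calls are CUDA runtime calls rather than MACA ones, and the code has not been run on MetaX hardware.

```cuda
#include <atomic>
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

#define CHECK(call)                                                            \
    do {                                                                       \
        cudaError_t err_ = (call);                                             \
        if (err_ != cudaSuccess) {                                             \
            fprintf(stderr, "%s failed: %s\n", #call, cudaGetErrorString(err_)); \
            exit(1);                                                           \
        }                                                                      \
    } while (0)

// Writes the payload, fences, then raises the flag. Without the fence the
// host could in principle see flag == 1 before it sees the new payload.
__global__ void publish(volatile float *data, volatile int *flag, float val)
{
    if (blockIdx.x == 0 && threadIdx.x == 0) {
        *data = val;
        __threadfence_system();
        *flag = 1;
    }
}

int main()
{
    // Pinned host memory mapped into the device address space.
    float *h_data; int *h_flag;
    CHECK(cudaHostAlloc((void **)&h_data, sizeof(float), cudaHostAllocMapped));
    CHECK(cudaHostAlloc((void **)&h_flag, sizeof(int), cudaHostAllocMapped));
    *h_data = 0.0f;
    *h_flag = 0;

    float *d_data; int *d_flag;
    CHECK(cudaHostGetDevicePointer((void **)&d_data, h_data, 0));
    CHECK(cudaHostGetDevicePointer((void **)&d_flag, h_flag, 0));

    publish<<<1, 1>>>(d_data, d_flag, 3.3f);
    CHECK(cudaGetLastError());

    // Poll the flag from the host. The stream query is a safety exit so the
    // loop ends once the kernel has finished or failed.
    volatile int *vflag = h_flag;
    while (*vflag == 0) {
        if (cudaStreamQuery(0) != cudaErrorNotReady) break;
    }
    // Keep the payload read from being reordered before the flag read.
    std::atomic_thread_fence(std::memory_order_acquire);
    CHECK(cudaDeviceSynchronize());

    int seen = *vflag;
    printf("flag seen: %d, payload: %.3f\n", seen, *h_data);

    CHECK(cudaFreeHost(h_data));
    CHECK(cudaFreeHost(h_flag));
    return seen ? 0 : 1;
}
```

The kernels posted earlier in this thread call the fence but then read and write the same single location with no separate flag, so there is no pair of writes for the fence to order. That may be part of why it appears to have no effect, independent of any caching behaviour specific to the hardware.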

  • Members 294 posts
    March 6, 2026 13:53

    Dear developer, please contact your business contact about this issue.

  • Thread has been moved from 解决中 (In Progress).

  • Members 9 posts
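    March 12, 2026 15:35

    Do you only sell cards and not care about the ecosystem? On the NVIDIA forums, engineers consistently help resolve all kinds of problems. Why is it that here we must "contact the business contact"? What is the point of this forum, then?

    Cross-GPU write visibility is a basic capability, yet it is not even clear how to achieve it. How are we supposed to put your cards to use? Or do you assume that customers have no development capability and should simply call whatever you have precompiled?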

  • Members 294 posts
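    March 12, 2026 15:39

    Dear developer, please contact your business contact about this issue. Customers who have purchased cards are covered by after-sales support, and the after-sales team will handle your issue.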

  • Members 8 posts
    March 12, 2026 17:48

    You could try mcExtMallocWithFlags (the API is described in /opt/maca/include/mcr/mc_runtime_api.h).

  • Members 294 posts
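    March 12, 2026 17:53

    Dear developer, please open a personal topic and provide detailed information, such as your project background, so that we can discuss it. See the attachments for details.

    image.png

    PNG, 223.4 KB, uploaded by shuai_chen on March 12, 2026.

    image.png

    PNG, 48.7 KB, uploaded by shuai_chen on March 12, 2026.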