localhost:80799:95212 [5] /workspace/out/Release/build/linux/x86_64/mccl/macaify/src/transport/net_ib.cc:1750 MCCL WARN NET/IB : Got completion from peer 192.168.1.205<56914> with error 12, opcode 0, len 0, vendor err 129
localhost:80799:95212 [5] MCCL INFO /workspace/communication/mccl/src/include/net.h:89 -> 6
localhost:80799:95212 [5] MCCL INFO /workspace/out/Release/build/linux/x86_64/mccl/macaify/src/transport/net.cc:1388 -> 6
localhost:80799:95212 [5] MCCL INFO /workspace/out/Release/build/linux/x86_64/mccl/macaify/src/proxy.cc:582 -> 6
localhost:80799:95212 [5] MCCL INFO /workspace/out/Release/build/linux/x86_64/mccl/macaify/src/proxy.cc:757 -> 6 [Proxy Thread]

==> /root/mxlog/mccl/mccl.80954.2026_03_24_12_31_16.log <==
localhost:80954:95032 [6] MCCL INFO Channel 20/0 : 6[5b000] -> 7[5e000] via P2P/IPC comm 0x55c7824c6070 nRanks 08
localhost:80954:95032 [6] MCCL INFO Channel 21/0 : 6[5b000] -> 7[5e000] via P2P/IPC comm 0x55c7824c6070 nRanks 08
localhost:80954:95032 [6] MCCL INFO Channel 22/0 : 6[5b000] -> 7[5e000] via P2P/IPC comm 0x55c7824c6070 nRanks 08
localhost:80954:95032 [6] MCCL INFO Channel 23/0 : 6[5b000] -> 7[5e000] via P2P/IPC comm 0x55c7824c6070 nRanks 08
localhost:80954:95032 [6] MCCL INFO Channel 24/0 : 6[5b000] -> 7[5e000] via P2P/IPC comm 0x55c7824c6070 nRanks 08
localhost:80954:95032 [6] MCCL INFO Channel 25/0 : 6[5b000] -> 7[5e000] via P2P/IPC comm 0x55c7824c6070 nRanks 08
localhost:80954:95032 [6] MCCL INFO Channel 26/0 : 6[5b000] -> 7[5e000] via P2P/IPC comm 0x55c7824c6070 nRanks 08
localhost:80954:95032 [6] MCCL INFO Channel 27/0 : 6[5b000] -> 7[5e000] via P2P/IPC comm 0x55c7824c6070 nRanks 08
localhost:80954:95032 [6] MCCL INFO Channel 28/0 : 6[5b000] -> 7[5e000] via P2P/IPC comm 0x55c7824c6070 nRanks 08
localhost:80954:95032 [6] MCCL INFO Channel 29/0 : 6[5b000] -> 7[5e000] via P2P/IPC comm 0x55c7824c6070 nRanks 08
localhost:80954:95032 [6] MCCL INFO Channel 30/0 : 6[5b000] -> 7[5e000] via P2P/IPC comm 0x55c7824c6070 nRanks 08
localhost:80954:95032 [6] MCCL INFO Channel 31/0 : 6[5b000] -> 7[5e000] via P2P/IPC comm 0x55c7824c6070 nRanks 08
localhost:80954:95032 [6] MCCL INFO Channel 00/0 : 6[5b000] -> 5[56000] via P2P/IPC comm 0x55c7824c6070 nRanks 08
localhost:80954:95032 [6] MCCL INFO Channel 01/0 : 6[5b000] -> 5[56000] via P2P/IPC comm 0x55c7824c6070 nRanks 08
localhost:80954:95032 [6] MCCL INFO Channel 02/0 : 6[5b000] -> 5[56000] via P2P/IPC comm 0x55c7824c6070 nRanks 08
localhost:80954:95032 [6] MCCL INFO Channel 03/0 : 6[5b000] -> 5[56000] via P2P/IPC comm 0x55c7824c6070 nRanks 08
localhost:80954:95032 [6] MCCL INFO Channel 04/0 : 6[5b000] -> 5[56000] via P2P/IPC comm 0x55c7824c6070 nRanks 08
localhost:80954:95032 [6] MCCL INFO Channel 05/0 : 6[5b000] -> 5[56000] via P2P/IPC comm 0x55c7824c6070 nRanks 08
localhost:80954:95032 [6] MCCL INFO Channel 06/0 : 6[5b000] -> 5[56000] via P2P/IPC comm 0x55c7824c6070 nRanks 08
localhost:80954:95032 [6] MCCL INFO Channel 07/0 : 6[5b000] -> 5[56000] via P2P/IPC comm 0x55c7824c6070 nRanks 08
localhost:80954:95032 [6] MCCL INFO Channel 08/0 : 6[5b000] -> 5[56000] via P2P/IPC comm 0x55c7824c6070 nRanks 08
localhost:80954:95032 [6] MCCL INFO Channel 09/0 : 6[5b000] -> 5[56000] via P2P/IPC comm 0x55c7824c6070 nRanks 08
localhost:80954:95032 [6] MCCL INFO Channel 10/0 : 6[5b000] -> 5[56000] via P2P/IPC comm 0x55c7824c6070 nRanks 08
localhost:80954:95032 [6] MCCL INFO Channel 11/0 : 6[5b000] -> 5[56000] via P2P/IPC comm 0x55c7824c6070 nRanks 08
localhost:80954:95032 [6] MCCL INFO Channel 12/0 : 6[5b000] -> 5[56000] via P2P/IPC comm 0x55c7824c6070 nRanks 08
localhost:80954:95032 [6] MCCL INFO Channel 13/0 : 6[5b000] -> 5[56000] via P2P/IPC comm 0x55c7824c6070 nRanks 08
localhost:80954:95032 [6] MCCL INFO Channel 14/0 : 6[5b000] -> 5[56000] via P2P/IPC comm 0x55c7824c6070 nRanks 08
localhost:80954:95032 [6] MCCL INFO Channel 15/0 : 6[5b000] -> 5[56000] via P2P/IPC comm 0x55c7824c6070 nRanks 08
localhost:80954:95032 [6] MCCL INFO Channel 16/0 : 6[5b000] -> 5[56000] via P2P/IPC comm 0x55c7824c6070 nRanks 08
localhost:80954:95032 [6] MCCL INFO Channel 17/0 : 6[5b000] -> 5[56000] via P2P/IPC comm 0x55c7824c6070 nRanks 08
localhost:80954:95032 [6] MCCL INFO Channel 18/0 : 6[5b000] -> 5[56000] via P2P/IPC comm 0x55c7824c6070 nRanks 08
localhost:80954:95032 [6] MCCL INFO Channel 19/0 : 6[5b000] -> 5[56000] via P2P/IPC comm 0x55c7824c6070 nRanks 08
localhost:80954:95032 [6] MCCL INFO Channel 20/0 : 6[5b000] -> 5[56000] via P2P/IPC comm 0x55c7824c6070 nRanks 08
localhost:80954:95032 [6] MCCL INFO Channel 21/0 : 6[5b000] -> 5[56000] via P2P/IPC comm 0x55c7824c6070 nRanks 08
localhost:80954:95032 [6] MCCL INFO Channel 22/0 : 6[5b000] -> 5[56000] via P2P/IPC comm 0x55c7824c6070 nRanks 08
localhost:80954:95032 [6] MCCL INFO Channel 23/0 : 6[5b000] -> 5[56000] via P2P/IPC comm 0x55c7824c6070 nRanks 08
localhost:80954:95032 [6] MCCL INFO Channel 24/0 : 6[5b000] -> 5[56000] via P2P/IPC comm 0x55c7824c6070 nRanks 08
localhost:80954:95032 [6] MCCL INFO Channel 25/0 : 6[5b000] -> 5[56000] via P2P/IPC comm 0x55c7824c6070 nRanks 08
localhost:80954:95032 [6] MCCL INFO Channel 26/0 : 6[5b000] -> 5[56000] via P2P/IPC comm 0x55c7824c6070 nRanks 08
localhost:80954:95032 [6] MCCL INFO Channel 27/0 : 6[5b000] -> 5[56000] via P2P/IPC comm 0x55c7824c6070 nRanks 08
localhost:80954:95032 [6] MCCL INFO Channel 28/0 : 6[5b000] -> 5[56000] via P2P/IPC comm 0x55c7824c6070 nRanks 08
localhost:80954:95032 [6] MCCL INFO Channel 29/0 : 6[5b000] -> 5[56000] via P2P/IPC comm 0x55c7824c6070 nRanks 08
localhost:80954:95032 [6] MCCL INFO Channel 30/0 : 6[5b000] -> 5[56000] via P2P/IPC comm 0x55c7824c6070 nRanks 08
localhost:80954:95032 [6] MCCL INFO Channel 31/0 : 6[5b000] -> 5[56000] via P2P/IPC comm 0x55c7824c6070 nRanks 08
localhost:80954:95032 [6] MCCL INFO Connected all trees
localhost:80954:80954 [6] MCCL INFO ip=192.168.1.204, port=0
localhost:80954:80954 [6] MCCL INFO MCCL version 2.16.5
localhost:80954:80954 [6] MCCL INFO Using network IB
localhost:80954:80954 [6] MCCL INFO comm=0x55c7834ed9e0, lastStream is initialized
localhost:80954:80954 [6] MCCL INFO ip=192.168.1.204, port=0
localhost:80954:80954 [6] MCCL INFO ip=192.168.1.204, port=0
localhost:80954:80954 [6] MCCL INFO ip=192.168.1.204, port=0
localhost:80954:80954 [6] MCCL INFO rank 0 cudeDev 6, local 3 socket 3 peer -1, fabricId 0 index 0 directLinkGroup 0 totalGpu 0, switchId 0
localhost:80954:80954 [6] MCCL INFO type:0, uuid: local 0xeeed1921, remote 0xd54f50f9 0x0, topology 2, isa: 10.0
localhost:80954:80954 [6] MCCL INFO Setting affinity for GPU 6 to ffff,00000000,00000000,00000000,0000ffff
localhost:80954:80954 [6] MCCL INFO MCCL pcie buffer mode 0, rcCount 1, nodeType 0x0, minAp 104
localhost:80954:80954 [6] MCCL INFO Disabling clique-based kernels due to topology (ignore with CLIQUE_IGNORE_TOPO)
localhost:80954:80954 [6] MCCL INFO CLIQUE InitRemoteReadToggle enable_read:0, isaVersion:1000, topologyId:2
localhost:80954:80954 [6] MCCL INFO Clique kernels disabled
localhost:80954:80954 [6] MCCL INFO Channel 00/32 : 0 1
localhost:80954:80954 [6] MCCL INFO Channel 01/32 : 0 1
localhost:80954:80954 [6] MCCL INFO Channel 02/32 : 0 1
localhost:80954:80954 [6] MCCL INFO Channel 03/32 : 0 1
localhost:80954:80954 [6] MCCL INFO Channel 04/32 : 0 1
localhost:80954:80954 [6] MCCL INFO Channel 05/32 : 0 1
localhost:80954:80954 [6] MCCL INFO Channel 06/32 : 0 1
localhost:80954:80954 [6] MCCL INFO Channel 07/32 : 0 1
localhost:80954:80954 [6] MCCL INFO Channel 08/32 : 0 1
localhost:80954:80954 [6] MCCL INFO Channel 09/32 : 0 1
localhost:80954:80954 [6] MCCL INFO Channel 10/32 : 0 1
localhost:80954:80954 [6] MCCL INFO Channel 11/32 : 0 1
localhost:80954:80954 [6] MCCL INFO Channel 12/32 : 0 1
localhost:80954:80954 [6] MCCL INFO Channel 13/32 : 0 1
localhost:80954:80954 [6] MCCL INFO Channel 14/32 : 0 1
localhost:80954:80954 [6] MCCL INFO Channel 15/32 : 0 1
localhost:80954:80954 [6] MCCL INFO Channel 16/32 : 0 1
localhost:80954:80954 [6] MCCL INFO Channel 17/32 : 0 1
localhost:80954:80954 [6] MCCL INFO Channel 18/32 : 0 1
localhost:80954:80954 [6] MCCL INFO Channel 19/32 : 0 1
localhost:80954:80954 [6] MCCL INFO Channel 20/32 : 0 1
localhost:80954:80954 [6] MCCL INFO Channel 21/32 : 0 1
localhost:80954:80954 [6] MCCL INFO Channel 22/32 : 0 1
localhost:80954:80954 [6] MCCL INFO Channel 23/32 : 0 1
localhost:80954:80954 [6] MCCL INFO Channel 24/32 : 0 1
localhost:80954:80954 [6] MCCL INFO Channel 25/32 : 0 1
localhost:80954:80954 [6] MCCL INFO Channel 26/32 : 0 1
localhost:80954:80954 [6] MCCL INFO Channel 27/32 : 0 1
localhost:80954:80954 [6] MCCL INFO Channel 28/32 : 0 1
localhost:80954:80954 [6] MCCL INFO Channel 29/32 : 0 1
localhost:80954:80954 [6] MCCL INFO Channel 30/32 : 0 1
localhost:80954:80954 [6] MCCL INFO Channel 31/32 : 0 1
localhost:80954:80954 [6] MCCL INFO Trees [0] 1/-1/-1->0->-1 [1] 1/-1/-1->0->-1 [2] -1/-1/-1->0->1 [3] -1/-1/-1->0->1 [4] 1/-1/-1->0->-1 [5] 1/-1/-1->0->-1 [6] -1/-1/-1->0->1 [7] -1/-1/-1->0->1 [8] 1/-1/-1->0->-1 [9] 1/-1/-1->0->-1 [10] -1/-1/-1->0->1 [11] -1/-1/-1->0->1 [12] 1/-1/-1->0->-1 [13] 1/-1/-1->0->-1 [14] -1/-1/-1->0->1 [15] -1/-1/-1->0->1 [16] 1/-1/-1->0->-1 [17] 1/-1/-1->0->-1 [18] -1/-1/-1->0->1 [19] -1/-1/-1->0->1 [20] 1/-1/-1->0->-1 [21] 1/-1/-1->0->-1 [22] -1/-1/-1->0->1 [23] -1/-1/-1->0->1 [24] 1/-1/-1->0->-1 [25] 1/-1/-1->0->-1 [26] -1/-1/-1->0->1 [27] -1/-1/-1->0->1 [28] 1/-1/-1->0->-1 [29] 1/-1/-1->0->-1 [30] -1/-1/-1->0->1 [31] -1/-1/-1->0->1 comm 0x55c7834ed9e0 nRanks 02 busId 5b000
localhost:80954:80954 [6] MCCL INFO P2P Chunksize set to 131072
localhost:80954:80954 [6] MCCL INFO disable groupWriteback
localhost:80954:80954 [6] MCCL INFO threadThresholds 8/8/64 | 8/8/64 | 256 | 256
localhost:80954:80954 [6] MCCL INFO 32 coll channels, 32 p2p channels, 32 p2p channels per peer
localhost:80954:80954 [6] MCCL INFO comm 0x55c7834ed9e0 rank 0 nranks 2 macaDev 6 busId 5b000 localSize 0 used 67228512 bytes - Init COMPLETE
localhost:80954:95213 [6] MCCL INFO ip=192.168.1.204, port=0
localhost:80954:95225 [6] MCCL INFO Channel 00/0 : 1[5b000] -> 0[5b000] [receive] via NET/IB/1/GDRDMA comm 0x55c7834ed9e0 nRanks 02
localhost:80954:95213 [6] MCCL INFO ip=192.168.1.204, port=0
localhost:80954:95225 [6] MCCL INFO Channel 01/0 : 1[5b000] -> 0[5b000] [receive] via NET/IB/1/GDRDMA comm 0x55c7834ed9e0 nRanks 02
localhost:80954:95213 [6] MCCL INFO ip=192.168.1.204, port=0
localhost:80954:95225 [6] MCCL INFO Channel 02/0 : 1[5b000] -> 0[5b000] [receive] via NET/IB/1/GDRDMA comm 0x55c7834ed9e0 nRanks 02
localhost:80954:95213 [6] MCCL INFO ip=192.168.1.204, port=0
localhost:80954:95225 [6] MCCL INFO Channel 03/0 : 1[5b000] -> 0[5b000] [receive] via NET/IB/1/GDRDMA comm 0x55c7834ed9e0 nRanks 02
localhost:80954:95213 [6] MCCL INFO ip=192.168.1.204, port=0
localhost:80954:95225 [6] MCCL INFO Channel 04/0 : 1[5b000] -> 0[5b000] [receive] via NET/IB/1/GDRDMA comm 0x55c7834ed9e0 nRanks 02
localhost:80954:95213 [6] MCCL INFO ip=192.168.1.204, port=0
localhost:80954:95225 [6] MCCL INFO Channel 05/0 : 1[5b000] -> 0[5b000] [receive] via NET/IB/1/GDRDMA comm 0x55c7834ed9e0 nRanks 02
localhost:80954:95213 [6] MCCL INFO ip=192.168.1.204, port=0
localhost:80954:95225 [6] MCCL INFO Channel 06/0 : 1[5b000] -> 0[5b000] [receive] via NET/IB/1/GDRDMA comm 0x55c7834ed9e0 nRanks 02
localhost:80954:95213 [6] MCCL INFO ip=192.168.1.204, port=0
localhost:80954:95225 [6] MCCL INFO Channel 07/0 : 1[5b000] -> 0[5b000] [receive] via NET/IB/1/GDRDMA comm 0x55c7834ed9e0 nRanks 02
localhost:80954:95213 [6] MCCL INFO ip=192.168.1.204, port=0
localhost:80954:95225 [6] MCCL INFO Channel 08/0 : 1[5b000] -> 0[5b000] [receive] via NET/IB/1/GDRDMA comm 0x55c7834ed9e0 nRanks 02
localhost:80954:95213 [6] MCCL INFO ip=192.168.1.204, port=0
localhost:80954:95225 [6] MCCL INFO Channel 09/0 : 1[5b000] -> 0[5b000] [receive] via NET/IB/1/GDRDMA comm 0x55c7834ed9e0 nRanks 02
localhost:80954:95213 [6] MCCL INFO ip=192.168.1.204, port=0
localhost:80954:95225 [6] MCCL INFO Channel 10/0 : 1[5b000] -> 0[5b000] [receive] via NET/IB/1/GDRDMA comm 0x55c7834ed9e0 nRanks 02
localhost:80954:95213 [6] MCCL INFO ip=192.168.1.204, port=0
localhost:80954:95225 [6] MCCL INFO Channel 11/0 : 1[5b000] -> 0[5b000] [receive] via NET/IB/1/GDRDMA comm 0x55c7834ed9e0 nRanks 02
localhost:80954:95213 [6] MCCL INFO ip=192.168.1.204, port=0
localhost:80954:95225 [6] MCCL INFO Channel 12/0 : 1[5b000] -> 0[5b000] [receive] via NET/IB/1/GDRDMA comm 0x55c7834ed9e0 nRanks 02
localhost:80954:95213 [6] MCCL INFO ip=192.168.1.204, port=0
localhost:80954:95225 [6] MCCL INFO Channel 13/0 : 1[5b000] -> 0[5b000] [receive] via NET/IB/1/GDRDMA comm 0x55c7834ed9e0 nRanks 02
localhost:80954:95213 [6] MCCL INFO ip=192.168.1.204, port=0
localhost:80954:95225 [6] MCCL INFO Channel 14/0 : 1[5b000] -> 0[5b000] [receive] via NET/IB/1/GDRDMA comm 0x55c7834ed9e0 nRanks 02
localhost:80954:95213 [6] MCCL INFO ip=192.168.1.204, port=0
localhost:80954:95225 [6] MCCL INFO Channel 15/0 : 1[5b000] -> 0[5b000] [receive] via NET/IB/1/GDRDMA comm 0x55c7834ed9e0 nRanks 02
localhost:80954:95213 [6] MCCL INFO ip=192.168.1.204, port=0
localhost:80954:95225 [6] MCCL INFO Channel 16/0 : 1[5b000] -> 0[5b000] [receive] via NET/IB/1/GDRDMA comm 0x55c7834ed9e0 nRanks 02
localhost:80954:95213 [6] MCCL INFO ip=192.168.1.204, port=0
localhost:80954:95225 [6] MCCL INFO Channel 17/0 : 1[5b000] -> 0[5b000] [receive] via NET/IB/1/GDRDMA comm 0x55c7834ed9e0 nRanks 02
localhost:80954:95213 [6] MCCL INFO ip=192.168.1.204, port=0
localhost:80954:95225 [6] MCCL INFO Channel 18/0 : 1[5b000] -> 0[5b000] [receive] via NET/IB/1/GDRDMA comm 0x55c7834ed9e0 nRanks 02
localhost:80954:95213 [6] MCCL INFO ip=192.168.1.204, port=0
localhost:80954:95225 [6] MCCL INFO Channel 19/0 : 1[5b000] -> 0[5b000] [receive] via NET/IB/1/GDRDMA comm 0x55c7834ed9e0 nRanks 02
localhost:80954:95213 [6] MCCL INFO ip=192.168.1.204, port=0
localhost:80954:95225 [6] MCCL INFO Channel 20/0 : 1[5b000] -> 0[5b000] [receive] via NET/IB/1/GDRDMA comm 0x55c7834ed9e0 nRanks 02
localhost:80954:95213 [6] MCCL INFO ip=192.168.1.204, port=0
localhost:80954:95225 [6] MCCL INFO Channel 21/0 : 1[5b000] -> 0[5b000] [receive] via NET/IB/1/GDRDMA comm 0x55c7834ed9e0 nRanks 02
localhost:80954:95213 [6] MCCL INFO ip=192.168.1.204, port=0
localhost:80954:95225 [6] MCCL INFO Channel 22/0 : 1[5b000] -> 0[5b000] [receive] via NET/IB/1/GDRDMA comm 0x55c7834ed9e0 nRanks 02
localhost:80954:95213 [6] MCCL INFO ip=192.168.1.204, port=0
localhost:80954:95225 [6] MCCL INFO Channel 23/0 : 1[5b000] -> 0[5b000] [receive] via NET/IB/1/GDRDMA comm 0x55c7834ed9e0 nRanks 02
localhost:80954:95213 [6] MCCL INFO ip=192.168.1.204, port=0
localhost:80954:95225 [6] MCCL INFO Channel 24/0 : 1[5b000] -> 0[5b000] [receive] via NET/IB/1/GDRDMA comm 0x55c7834ed9e0 nRanks 02
localhost:80954:95213 [6] MCCL INFO ip=192.168.1.204, port=0
localhost:80954:95225 [6] MCCL INFO Channel 25/0 : 1[5b000] -> 0[5b000] [receive] via NET/IB/1/GDRDMA comm 0x55c7834ed9e0 nRanks 02
localhost:80954:95213 [6] MCCL INFO ip=192.168.1.204, port=0
localhost:80954:95225 [6] MCCL INFO Channel 26/0 : 1[5b000] -> 0[5b000] [receive] via NET/IB/1/GDRDMA comm 0x55c7834ed9e0 nRanks 02
localhost:80954:95213 [6] MCCL INFO ip=192.168.1.204, port=0
localhost:80954:95225 [6] MCCL INFO Channel 27/0 : 1[5b000] -> 0[5b000] [receive] via NET/IB/1/GDRDMA comm 0x55c7834ed9e0 nRanks 02
localhost:80954:95213 [6] MCCL INFO ip=192.168.1.204, port=0
localhost:80954:95225 [6] MCCL INFO Channel 28/0 : 1[5b000] -> 0[5b000] [receive] via NET/IB/1/GDRDMA comm 0x55c7834ed9e0 nRanks 02
localhost:80954:95213 [6] MCCL INFO ip=192.168.1.204, port=0
localhost:80954:95225 [6] MCCL INFO Channel 29/0 : 1[5b000] -> 0[5b000] [receive] via NET/IB/1/GDRDMA comm 0x55c7834ed9e0 nRanks 02
localhost:80954:95213 [6] MCCL INFO ip=192.168.1.204, port=0
localhost:80954:95225 [6] MCCL INFO Channel 30/0 : 1[5b000] -> 0[5b000] [receive] via NET/IB/1/GDRDMA comm 0x55c7834ed9e0 nRanks 02
localhost:80954:95213 [6] MCCL INFO ip=192.168.1.204, port=0
localhost:80954:95225 [6] MCCL INFO Channel 31/0 : 1[5b000] -> 0[5b000] [receive] via NET/IB/1/GDRDMA comm 0x55c7834ed9e0 nRanks 02
localhost:80954:95225 [6] MCCL INFO Channel 00/0 : 0[5b000] -> 1[5b000] [send] via NET/IB/1/GDRDMA comm 0x55c7834ed9e0 nRanks 02
localhost:80954:95225 [6] MCCL INFO Channel 01/0 : 0[5b000] -> 1[5b000] [send] via NET/IB/1/GDRDMA comm 0x55c7834ed9e0 nRanks 02
localhost:80954:95225 [6] MCCL INFO Channel 02/0 : 0[5b000] -> 1[5b000] [send] via NET/IB/1/GDRDMA comm 0x55c7834ed9e0 nRanks 02
localhost:80954:95225 [6] MCCL INFO Channel 03/0 : 0[5b000] -> 1[5b000] [send] via NET/IB/1/GDRDMA comm 0x55c7834ed9e0 nRanks 02
localhost:80954:95225 [6] MCCL INFO Channel 04/0 : 0[5b000] -> 1[5b000] [send] via NET/IB/1/GDRDMA comm 0x55c7834ed9e0 nRanks 02
localhost:80954:95225 [6] MCCL INFO Channel 05/0 : 0[5b000] -> 1[5b000] [send] via NET/IB/1/GDRDMA comm 0x55c7834ed9e0 nRanks 02
localhost:80954:95225 [6] MCCL INFO Channel 06/0 : 0[5b000] -> 1[5b000] [send] via NET/IB/1/GDRDMA comm 0x55c7834ed9e0 nRanks 02
localhost:80954:95225 [6] MCCL INFO Channel 07/0 : 0[5b000] -> 1[5b000] [send] via NET/IB/1/GDRDMA comm 0x55c7834ed9e0 nRanks 02
localhost:80954:95225 [6] MCCL INFO Channel 08/0 : 0[5b000] -> 1[5b000] [send] via NET/IB/1/GDRDMA comm 0x55c7834ed9e0 nRanks 02
localhost:80954:95225 [6] MCCL INFO Channel 09/0 : 0[5b000] -> 1[5b000] [send] via NET/IB/1/GDRDMA comm 0x55c7834ed9e0 nRanks 02
localhost:80954:95225 [6] MCCL INFO Channel 10/0 : 0[5b000] -> 1[5b000] [send] via NET/IB/1/GDRDMA comm 0x55c7834ed9e0 nRanks 02
localhost:80954:95225 [6] MCCL INFO Channel 11/0 : 0[5b000] -> 1[5b000] [send] via NET/IB/1/GDRDMA comm 0x55c7834ed9e0 nRanks 02
localhost:80954:95225 [6] MCCL INFO Channel 12/0 : 0[5b000] -> 1[5b000] [send] via NET/IB/1/GDRDMA comm 0x55c7834ed9e0 nRanks 02
localhost:80954:95225 [6] MCCL INFO Channel 13/0 : 0[5b000] -> 1[5b000] [send] via NET/IB/1/GDRDMA comm 0x55c7834ed9e0 nRanks 02
localhost:80954:95225 [6] MCCL INFO Channel 14/0 : 0[5b000] -> 1[5b000] [send] via NET/IB/1/GDRDMA comm 0x55c7834ed9e0 nRanks 02
localhost:80954:95225 [6] MCCL INFO Channel 15/0 : 0[5b000] -> 1[5b000] [send] via NET/IB/1/GDRDMA comm 0x55c7834ed9e0 nRanks 02
localhost:80954:95225 [6] MCCL INFO Channel 16/0 : 0[5b000] -> 1[5b000] [send] via NET/IB/1/GDRDMA comm 0x55c7834ed9e0 nRanks 02
localhost:80954:95225 [6] MCCL INFO Channel 17/0 : 0[5b000] -> 1[5b000] [send] via NET/IB/1/GDRDMA comm 0x55c7834ed9e0 nRanks 02
localhost:80954:95225 [6] MCCL INFO Channel 18/0 : 0[5b000] -> 1[5b000] [send] via NET/IB/1/GDRDMA comm 0x55c7834ed9e0 nRanks 02
localhost:80954:95225 [6] MCCL INFO Channel 19/0 : 0[5b000] -> 1[5b000] [send] via NET/IB/1/GDRDMA comm 0x55c7834ed9e0 nRanks 02
localhost:80954:95225 [6] MCCL INFO Channel 20/0 : 0[5b000] -> 1[5b000] [send] via NET/IB/1/GDRDMA comm 0x55c7834ed9e0 nRanks 02
localhost:80954:95225 [6] MCCL INFO Channel 21/0 : 0[5b000] -> 1[5b000] [send] via NET/IB/1/GDRDMA comm 0x55c7834ed9e0 nRanks 02
localhost:80954:95225 [6] MCCL INFO Channel 22/0 : 0[5b000] -> 1[5b000] [send] via NET/IB/1/GDRDMA comm 0x55c7834ed9e0 nRanks 02
localhost:80954:95225 [6] MCCL INFO Channel 23/0 : 0[5b000] -> 1[5b000] [send] via NET/IB/1/GDRDMA comm 0x55c7834ed9e0 nRanks 02
localhost:80954:95225 [6] MCCL INFO Channel 24/0 : 0[5b000] -> 1[5b000] [send] via NET/IB/1/GDRDMA comm 0x55c7834ed9e0 nRanks 02
localhost:80954:95225 [6] MCCL INFO Channel 25/0 : 0[5b000] -> 1[5b000] [send] via NET/IB/1/GDRDMA comm 0x55c7834ed9e0 nRanks 02
localhost:80954:95225 [6] MCCL INFO Channel 26/0 : 0[5b000] -> 1[5b000] [send] via NET/IB/1/GDRDMA comm 0x55c7834ed9e0 nRanks 02
localhost:80954:95225 [6] MCCL INFO Channel 27/0 : 0[5b000] -> 1[5b000] [send] via NET/IB/1/GDRDMA comm 0x55c7834ed9e0 nRanks 02
localhost:80954:95225 [6] MCCL INFO Channel 28/0 : 0[5b000] -> 1[5b000] [send] via NET/IB/1/GDRDMA comm 0x55c7834ed9e0 nRanks 02
localhost:80954:95225 [6] MCCL INFO Channel 29/0 : 0[5b000] -> 1[5b000] [send] via NET/IB/1/GDRDMA comm 0x55c7834ed9e0 nRanks 02
localhost:80954:95225 [6] MCCL INFO Channel 30/0 : 0[5b000] -> 1[5b000] [send] via NET/IB/1/GDRDMA comm 0x55c7834ed9e0 nRanks 02
localhost:80954:95225 [6] MCCL INFO Channel 31/0 : 0[5b000] -> 1[5b000] [send] via NET/IB/1/GDRDMA comm 0x55c7834ed9e0 nRanks 02
localhost:80954:95225 [6] MCCL INFO Connected all trees
localhost:80954:95218 [6] /workspace/out/Release/build/linux/x86_64/mccl/macaify/src/transport/net_ib.cc:1750 MCCL WARN NET/IB : Got completion from peer 192.168.1.205<41930> with error 12, opcode 0, len 0, vendor err 129
localhost:80954:95218 [6] MCCL INFO /workspace/communication/mccl/src/include/net.h:89 -> 6
localhost:80954:95218 [6] MCCL INFO /workspace/out/Release/build/linux/x86_64/mccl/macaify/src/transport/net.cc:1388 -> 6
localhost:80954:95218 [6] MCCL INFO /workspace/out/Release/build/linux/x86_64/mccl/macaify/src/proxy.cc:582 -> 6
localhost:80954:95218 [6] MCCL INFO /workspace/out/Release/build/linux/x86_64/mccl/macaify/src/proxy.cc:757 -> 6 [Proxy Thread]