[root@localhost /opt/maca/samples/mccl_tests/perf]# ./cluster.sh
Use the default ip addr.
Run with parameters for custom ip addr, for example: bash cluster.sh ip_1:proc_count,ip_2:proc_count gpu_num test_name
The test is all_reduce_perf, the maca version is /opt/maca-3.5.3
main_process = 3808851
===============================
# nThread 1 nGpus 1 minBytes 1024 maxBytes 1073741824 step: 2(factor) warmup iters: 5 iters: 10 agg iters: 1 validation: 1 graph: 0
#
# Using devices
#   Rank  0 Pid 3808851 on  localhost device  0 [0x05] MetaX C500
#   Rank  1 Pid 3808852 on  localhost device  1 [0x0b] MetaX C500
#   Rank  2 Pid 3808853 on  localhost device  2 [0x0e] MetaX C500
#   Rank  3 Pid 3808854 on  localhost device  3 [0x0f] MetaX C500
#   Rank  4 Pid 3808855 on  localhost device  4 [0x55] MetaX C500
#   Rank  5 Pid 3808857 on  localhost device  5 [0x56] MetaX C500
#   Rank  6 Pid 3808858 on  localhost device  6 [0x5b] MetaX C500
#   Rank  7 Pid 3808864 on  localhost device  7 [0x5e] MetaX C500
#   Rank  8 Pid 1542590 on  localhost device  0 [0x05] MetaX C500
#   Rank  9 Pid 1542591 on  localhost device  1 [0x0b] MetaX C500
#   Rank 10 Pid 1542592 on  localhost device  2 [0x0e] MetaX C500
#   Rank 11 Pid 1542593 on  localhost device  3 [0x0f] MetaX C500
#   Rank 12 Pid 1542594 on  localhost device  4 [0x55] MetaX C500
#   Rank 13 Pid 1542595 on  localhost device  5 [0x56] MetaX C500
#   Rank 14 Pid 1542596 on  localhost device  6 [0x5b] MetaX C500
#   Rank 15 Pid 1542597 on  localhost device  7 [0x5e] MetaX C500
localhost:3808851:3808851 [0] MCCL INFO /root/mxlog/mccl/mccl.3808851.2026_03_23_15_58_46.log
localhost:3808851:3808851 [0] MCCL INFO MCCL_SOCKET_IFNAME set to p50p1,p51p1
localhost:3808851:3808851 [0] MCCL INFO Bootstrap : Using p51p1:192.168.100.13<0>
localhost:3808851:3808851 [0] MCCL INFO NUMA auto balancing enabled which can lead to variability in the MCCL performance!
Disable by "sudo sysctl kernel.numa_balancing=0" localhost:3808851:3808851 [0] MCCL INFO ip=192.168.100.13, port=0 localhost:1542597:1542597 [7] MCCL INFO /root/mxlog/mccl/mccl.1542597.2026_03_23_16_01_30.log localhost:1542590:1542590 [0] MCCL INFO /root/mxlog/mccl/mccl.1542590.2026_03_23_16_01_30.log localhost:1542593:1542593 [3] MCCL INFO /root/mxlog/mccl/mccl.1542593.2026_03_23_16_01_30.log localhost:3808851:3808851 [0] MCCL INFO MCCL version 2.16.5 localhost:1542597:1542597 [7] MCCL INFO MCCL_SOCKET_IFNAME set to p50p1,p51p1 localhost:1542596:1542596 [6] MCCL INFO /root/mxlog/mccl/mccl.1542596.2026_03_23_16_01_30.log localhost:1542594:1542594 [4] MCCL INFO /root/mxlog/mccl/mccl.1542594.2026_03_23_16_01_30.log localhost:1542595:1542595 [5] MCCL INFO /root/mxlog/mccl/mccl.1542595.2026_03_23_16_01_30.log localhost:1542592:1542592 [2] MCCL INFO /root/mxlog/mccl/mccl.1542592.2026_03_23_16_01_30.log localhost:3808855:3808855 [4] MCCL INFO /root/mxlog/mccl/mccl.3808855.2026_03_23_15_58_46.log localhost:1542591:1542591 [1] MCCL INFO /root/mxlog/mccl/mccl.1542591.2026_03_23_16_01_30.log localhost:1542597:1542597 [7] MCCL INFO Bootstrap : Using p51p1:192.168.100.14<0> localhost:1542597:1542597 [7] MCCL INFO NUMA auto balancing enabled which can lead to variability in the MCCL performance! 
Disable by "sudo sysctl kernel.numa_balancing=0" localhost:3808857:3808857 [5] MCCL INFO /root/mxlog/mccl/mccl.3808857.2026_03_23_15_58_46.log localhost:3808858:3808858 [6] MCCL INFO /root/mxlog/mccl/mccl.3808858.2026_03_23_15_58_46.log localhost:3808864:3808864 [7] MCCL INFO /root/mxlog/mccl/mccl.3808864.2026_03_23_15_58_46.log localhost:1542590:1542590 [0] MCCL INFO MCCL_SOCKET_IFNAME set to p50p1,p51p1 localhost:1542593:1542593 [3] MCCL INFO MCCL_SOCKET_IFNAME set to p50p1,p51p1 localhost:3808854:3808854 [3] MCCL INFO /root/mxlog/mccl/mccl.3808854.2026_03_23_15_58_46.log localhost:3808852:3808852 [1] MCCL INFO /root/mxlog/mccl/mccl.3808852.2026_03_23_15_58_46.log localhost:3808853:3808853 [2] MCCL INFO /root/mxlog/mccl/mccl.3808853.2026_03_23_15_58_46.log localhost:1542596:1542596 [6] MCCL INFO MCCL_SOCKET_IFNAME set to p50p1,p51p1 localhost:1542595:1542595 [5] MCCL INFO MCCL_SOCKET_IFNAME set to p50p1,p51p1 localhost:1542594:1542594 [4] MCCL INFO MCCL_SOCKET_IFNAME set to p50p1,p51p1 localhost:1542592:1542592 [2] MCCL INFO MCCL_SOCKET_IFNAME set to p50p1,p51p1 localhost:3808855:3808855 [4] MCCL INFO MCCL_SOCKET_IFNAME set to p50p1,p51p1 localhost:1542591:1542591 [1] MCCL INFO MCCL_SOCKET_IFNAME set to p50p1,p51p1 localhost:3808858:3808858 [6] MCCL INFO MCCL_SOCKET_IFNAME set to p50p1,p51p1 localhost:3808857:3808857 [5] MCCL INFO MCCL_SOCKET_IFNAME set to p50p1,p51p1 localhost:3808854:3808854 [3] MCCL INFO MCCL_SOCKET_IFNAME set to p50p1,p51p1 localhost:3808853:3808853 [2] MCCL INFO MCCL_SOCKET_IFNAME set to p50p1,p51p1 localhost:3808852:3808852 [1] MCCL INFO MCCL_SOCKET_IFNAME set to p50p1,p51p1 localhost:3808864:3808864 [7] MCCL INFO MCCL_SOCKET_IFNAME set to p50p1,p51p1 localhost:1542590:1542590 [0] MCCL INFO Bootstrap : Using p51p1:192.168.100.14<0> localhost:1542590:1542590 [0] MCCL INFO NUMA auto balancing enabled which can lead to variability in the MCCL performance! 
Disable by "sudo sysctl kernel.numa_balancing=0" localhost:1542593:1542593 [3] MCCL INFO Bootstrap : Using p51p1:192.168.100.14<0> localhost:1542593:1542593 [3] MCCL INFO NUMA auto balancing enabled which can lead to variability in the MCCL performance! Disable by "sudo sysctl kernel.numa_balancing=0" localhost:1542594:1542594 [4] MCCL INFO Bootstrap : Using p51p1:192.168.100.14<0> localhost:1542594:1542594 [4] MCCL INFO NUMA auto balancing enabled which can lead to variability in the MCCL performance! Disable by "sudo sysctl kernel.numa_balancing=0" localhost:1542596:1542596 [6] MCCL INFO Bootstrap : Using p51p1:192.168.100.14<0> localhost:1542596:1542596 [6] MCCL INFO NUMA auto balancing enabled which can lead to variability in the MCCL performance! Disable by "sudo sysctl kernel.numa_balancing=0" localhost:1542597:1542724 [7] MCCL INFO MCCL_IB_HCA set to rocep6s0,rocep95s0 localhost:1542595:1542595 [5] MCCL INFO Bootstrap : Using p51p1:192.168.100.14<0> localhost:1542595:1542595 [5] MCCL INFO NUMA auto balancing enabled which can lead to variability in the MCCL performance! Disable by "sudo sysctl kernel.numa_balancing=0" localhost:1542592:1542592 [2] MCCL INFO Bootstrap : Using p51p1:192.168.100.14<0> localhost:1542592:1542592 [2] MCCL INFO NUMA auto balancing enabled which can lead to variability in the MCCL performance! Disable by "sudo sysctl kernel.numa_balancing=0" localhost:1542591:1542591 [1] MCCL INFO Bootstrap : Using p51p1:192.168.100.14<0> localhost:1542591:1542591 [1] MCCL INFO NUMA auto balancing enabled which can lead to variability in the MCCL performance! 
Disable by "sudo sysctl kernel.numa_balancing=0" localhost:1542590:1542725 [0] MCCL INFO MCCL_IB_HCA set to rocep6s0,rocep95s0 localhost:1542592:1542731 [2] MCCL INFO MCCL_IB_HCA set to rocep6s0,rocep95s0 localhost:1542593:1542726 [3] MCCL INFO MCCL_IB_HCA set to rocep6s0,rocep95s0 localhost:1542594:1542728 [4] MCCL INFO MCCL_IB_HCA set to rocep6s0,rocep95s0 localhost:1542596:1542727 [6] MCCL INFO MCCL_IB_HCA set to rocep6s0,rocep95s0 localhost:1542591:1542730 [1] MCCL INFO MCCL_IB_HCA set to rocep6s0,rocep95s0 localhost:1542595:1542729 [5] MCCL INFO MCCL_IB_HCA set to rocep6s0,rocep95s0 localhost:3808851:3809571 [0] MCCL INFO MCCL_IB_HCA set to rocep6s0,rocep95s0 localhost:1542597:1542724 [7] MCCL INFO Not Supported NIC: rocep6s0 localhost:3808855:3808855 [4] MCCL INFO Bootstrap : Using p51p1:192.168.100.13<0> localhost:3808855:3808855 [4] MCCL INFO NUMA auto balancing enabled which can lead to variability in the MCCL performance! Disable by "sudo sysctl kernel.numa_balancing=0" localhost:3808854:3808854 [3] MCCL INFO Bootstrap : Using p51p1:192.168.100.13<0> localhost:3808858:3808858 [6] MCCL INFO Bootstrap : Using p51p1:192.168.100.13<0> localhost:3808857:3808857 [5] MCCL INFO Bootstrap : Using p51p1:192.168.100.13<0> localhost:3808853:3808853 [2] MCCL INFO Bootstrap : Using p51p1:192.168.100.13<0> localhost:3808852:3808852 [1] MCCL INFO Bootstrap : Using p51p1:192.168.100.13<0> localhost:3808864:3808864 [7] MCCL INFO Bootstrap : Using p51p1:192.168.100.13<0> localhost:3808858:3808858 [6] MCCL INFO NUMA auto balancing enabled which can lead to variability in the MCCL performance! Disable by "sudo sysctl kernel.numa_balancing=0" localhost:3808857:3808857 [5] MCCL INFO NUMA auto balancing enabled which can lead to variability in the MCCL performance! Disable by "sudo sysctl kernel.numa_balancing=0" localhost:3808853:3808853 [2] MCCL INFO NUMA auto balancing enabled which can lead to variability in the MCCL performance! 
Disable by "sudo sysctl kernel.numa_balancing=0" localhost:3808852:3808852 [1] MCCL INFO NUMA auto balancing enabled which can lead to variability in the MCCL performance! Disable by "sudo sysctl kernel.numa_balancing=0" localhost:3808854:3808854 [3] MCCL INFO NUMA auto balancing enabled which can lead to variability in the MCCL performance! Disable by "sudo sysctl kernel.numa_balancing=0" localhost:3808864:3808864 [7] MCCL INFO NUMA auto balancing enabled which can lead to variability in the MCCL performance! Disable by "sudo sysctl kernel.numa_balancing=0" localhost:1542596:1542727 [6] MCCL INFO Not Supported NIC: rocep6s0 localhost:1542594:1542728 [4] MCCL INFO Not Supported NIC: rocep6s0 localhost:1542597:1542724 [7] MCCL INFO Not Supported NIC: rocep95s0 localhost:1542597:1542724 [7] MCCL INFO NET/IB : Using [0]rocep6s0:1/RoCE [1]rocep95s0:1/RoCE ; OOB p51p1:192.168.100.14<0> localhost:1542597:1542724 [7] MCCL INFO Using network IB localhost:1542597:1542724 [7] MCCL INFO comm=0x7f75cdd40010, lastStream is initialized localhost:1542597:1542724 [7] MCCL INFO ip=192.168.100.14, port=0 localhost:1542597:1542724 [7] MCCL INFO ip=192.168.100.14, port=0 localhost:1542594:1542728 [4] MCCL INFO Not Supported NIC: rocep95s0 localhost:1542594:1542728 [4] MCCL INFO NET/IB : Using [0]rocep6s0:1/RoCE [1]rocep95s0:1/RoCE ; OOB p51p1:192.168.100.14<0> localhost:1542594:1542728 [4] MCCL INFO Using network IB localhost:1542594:1542728 [4] MCCL INFO comm=0x7fef53940010, lastStream is initialized localhost:1542590:1542725 [0] MCCL INFO Not Supported NIC: rocep6s0 localhost:1542591:1542730 [1] MCCL INFO Not Supported NIC: rocep6s0 localhost:1542594:1542728 [4] MCCL INFO ip=192.168.100.14, port=0 localhost:1542594:1542728 [4] MCCL INFO ip=192.168.100.14, port=0 localhost:1542595:1542729 [5] MCCL INFO Not Supported NIC: rocep6s0 localhost:1542592:1542731 [2] MCCL INFO Not Supported NIC: rocep6s0 localhost:1542593:1542726 [3] MCCL INFO Not Supported NIC: rocep6s0 
localhost:3808851:3809571 [0] MCCL INFO Not Supported NIC: rocep6s0 localhost:1542590:1542725 [0] MCCL INFO Not Supported NIC: rocep95s0 localhost:1542590:1542725 [0] MCCL INFO NET/IB : Using [0]rocep6s0:1/RoCE [1]rocep95s0:1/RoCE ; OOB p51p1:192.168.100.14<0> localhost:1542590:1542725 [0] MCCL INFO Using network IB localhost:1542596:1542727 [6] MCCL INFO Not Supported NIC: rocep95s0 localhost:1542596:1542727 [6] MCCL INFO NET/IB : Using [0]rocep6s0:1/RoCE [1]rocep95s0:1/RoCE ; OOB p51p1:192.168.100.14<0> localhost:1542596:1542727 [6] MCCL INFO Using network IB localhost:1542590:1542725 [0] MCCL INFO comm=0x7fced0b40010, lastStream is initialized localhost:1542595:1542729 [5] MCCL INFO Not Supported NIC: rocep95s0 localhost:1542595:1542729 [5] MCCL INFO NET/IB : Using [0]rocep6s0:1/RoCE [1]rocep95s0:1/RoCE ; OOB p51p1:192.168.100.14<0> localhost:1542595:1542729 [5] MCCL INFO Using network IB localhost:1542596:1542727 [6] MCCL INFO comm=0x7efc82c24010, lastStream is initialized localhost:1542590:1542725 [0] MCCL INFO ip=192.168.100.14, port=0 localhost:1542590:1542725 [0] MCCL INFO ip=192.168.100.14, port=0 localhost:1542593:1542726 [3] MCCL INFO Not Supported NIC: rocep95s0 localhost:1542592:1542731 [2] MCCL INFO Not Supported NIC: rocep95s0 localhost:1542595:1542729 [5] MCCL INFO comm=0x7f93ef76a010, lastStream is initialized localhost:1542593:1542726 [3] MCCL INFO NET/IB : Using [0]rocep6s0:1/RoCE [1]rocep95s0:1/RoCE ; OOB p51p1:192.168.100.14<0> localhost:1542593:1542726 [3] MCCL INFO Using network IB localhost:1542592:1542731 [2] MCCL INFO NET/IB : Using [0]rocep6s0:1/RoCE [1]rocep95s0:1/RoCE ; OOB p51p1:192.168.100.14<0> localhost:1542592:1542731 [2] MCCL INFO Using network IB localhost:1542596:1542727 [6] MCCL INFO ip=192.168.100.14, port=0 localhost:1542596:1542727 [6] MCCL INFO ip=192.168.100.14, port=0 localhost:1542595:1542729 [5] MCCL INFO ip=192.168.100.14, port=0 localhost:1542595:1542729 [5] MCCL INFO ip=192.168.100.14, port=0 
localhost:1542592:1542731 [2] MCCL INFO comm=0x7fe15ae00010, lastStream is initialized localhost:1542593:1542726 [3] MCCL INFO comm=0x7f4964024010, lastStream is initialized localhost:1542592:1542731 [2] MCCL INFO ip=192.168.100.14, port=0 localhost:1542592:1542731 [2] MCCL INFO ip=192.168.100.14, port=0 localhost:1542593:1542726 [3] MCCL INFO ip=192.168.100.14, port=0 localhost:1542593:1542726 [3] MCCL INFO ip=192.168.100.14, port=0 localhost:3808855:3809573 [4] MCCL INFO MCCL_IB_HCA set to rocep6s0,rocep95s0 localhost:3808854:3809576 [3] MCCL INFO MCCL_IB_HCA set to rocep6s0,rocep95s0 localhost:3808858:3809574 [6] MCCL INFO MCCL_IB_HCA set to rocep6s0,rocep95s0 localhost:3808857:3809577 [5] MCCL INFO MCCL_IB_HCA set to rocep6s0,rocep95s0 localhost:3808853:3809578 [2] MCCL INFO MCCL_IB_HCA set to rocep6s0,rocep95s0 localhost:3808852:3809575 [1] MCCL INFO MCCL_IB_HCA set to rocep6s0,rocep95s0 localhost:3808864:3809579 [7] MCCL INFO MCCL_IB_HCA set to rocep6s0,rocep95s0 localhost:1542591:1542730 [1] MCCL INFO Not Supported NIC: rocep95s0 localhost:1542591:1542730 [1] MCCL INFO NET/IB : Using [0]rocep6s0:1/RoCE [1]rocep95s0:1/RoCE ; OOB p51p1:192.168.100.14<0> localhost:1542591:1542730 [1] MCCL INFO Using network IB localhost:1542591:1542730 [1] MCCL INFO comm=0x7f1e96c00010, lastStream is initialized localhost:1542591:1542730 [1] MCCL INFO ip=192.168.100.14, port=0 localhost:1542591:1542730 [1] MCCL INFO ip=192.168.100.14, port=0 localhost:3808851:3809571 [0] MCCL INFO Not Supported NIC: rocep95s0 localhost:3808851:3809571 [0] MCCL INFO NET/IB : Using [0]rocep6s0:1/RoCE [1]rocep95s0:1/RoCE ; OOB p51p1:192.168.100.13<0> localhost:3808851:3809571 [0] MCCL INFO Using network IB localhost:3808851:3809571 [0] MCCL INFO comm=0x7feed6140010, lastStream is initialized localhost:3808851:3809571 [0] MCCL INFO ip=192.168.100.13, port=0 localhost:3808851:3809571 [0] MCCL INFO ip=192.168.100.13, port=0 localhost:3808855:3809573 [4] MCCL INFO Not Supported NIC: rocep6s0 
localhost:3808854:3809576 [3] MCCL INFO Not Supported NIC: rocep6s0 localhost:3808853:3809578 [2] MCCL INFO Not Supported NIC: rocep6s0 localhost:3808858:3809574 [6] MCCL INFO Not Supported NIC: rocep6s0 localhost:3808864:3809579 [7] MCCL INFO Not Supported NIC: rocep6s0 localhost:3808852:3809575 [1] MCCL INFO Not Supported NIC: rocep6s0 localhost:3808857:3809577 [5] MCCL INFO Not Supported NIC: rocep6s0 localhost:3808858:3809574 [6] MCCL INFO Not Supported NIC: rocep95s0 localhost:3808858:3809574 [6] MCCL INFO NET/IB : Using [0]rocep6s0:1/RoCE [1]rocep95s0:1/RoCE ; OOB p51p1:192.168.100.13<0> localhost:3808858:3809574 [6] MCCL INFO Using network IB localhost:3808864:3809579 [7] MCCL INFO Not Supported NIC: rocep95s0 localhost:3808864:3809579 [7] MCCL INFO NET/IB : Using [0]rocep6s0:1/RoCE [1]rocep95s0:1/RoCE ; OOB p51p1:192.168.100.13<0> localhost:3808864:3809579 [7] MCCL INFO Using network IB localhost:3808858:3809574 [6] MCCL INFO comm=0x7fcf8a717010, lastStream is initialized localhost:3808864:3809579 [7] MCCL INFO comm=0x7f14a6340010, lastStream is initialized localhost:3808858:3809574 [6] MCCL INFO ip=192.168.100.13, port=0 localhost:3808858:3809574 [6] MCCL INFO ip=192.168.100.13, port=0 localhost:3808864:3809579 [7] MCCL INFO ip=192.168.100.13, port=0 localhost:3808864:3809579 [7] MCCL INFO ip=192.168.100.13, port=0 localhost:3808854:3809576 [3] MCCL INFO Not Supported NIC: rocep95s0 localhost:3808855:3809573 [4] MCCL INFO Not Supported NIC: rocep95s0 localhost:3808854:3809576 [3] MCCL INFO NET/IB : Using [0]rocep6s0:1/RoCE [1]rocep95s0:1/RoCE ; OOB p51p1:192.168.100.13<0> localhost:3808854:3809576 [3] MCCL INFO Using network IB localhost:3808855:3809573 [4] MCCL INFO NET/IB : Using [0]rocep6s0:1/RoCE [1]rocep95s0:1/RoCE ; OOB p51p1:192.168.100.13<0> localhost:3808855:3809573 [4] MCCL INFO Using network IB localhost:3808853:3809578 [2] MCCL INFO Not Supported NIC: rocep95s0 localhost:3808853:3809578 [2] MCCL INFO NET/IB : Using [0]rocep6s0:1/RoCE 
[1]rocep95s0:1/RoCE ; OOB p51p1:192.168.100.13<0> localhost:3808853:3809578 [2] MCCL INFO Using network IB localhost:3808854:3809576 [3] MCCL INFO comm=0x7f10c1b6a010, lastStream is initialized localhost:3808855:3809573 [4] MCCL INFO comm=0x7fb633d40010, lastStream is initialized localhost:3808853:3809578 [2] MCCL INFO comm=0x7fc704e00010, lastStream is initialized localhost:3808855:3809573 [4] MCCL INFO ip=192.168.100.13, port=0 localhost:3808854:3809576 [3] MCCL INFO ip=192.168.100.13, port=0 localhost:3808854:3809576 [3] MCCL INFO ip=192.168.100.13, port=0 localhost:3808855:3809573 [4] MCCL INFO ip=192.168.100.13, port=0 localhost:3808852:3809575 [1] MCCL INFO Not Supported NIC: rocep95s0 localhost:3808852:3809575 [1] MCCL INFO NET/IB : Using [0]rocep6s0:1/RoCE [1]rocep95s0:1/RoCE ; OOB p51p1:192.168.100.13<0> localhost:3808852:3809575 [1] MCCL INFO Using network IB localhost:3808853:3809578 [2] MCCL INFO ip=192.168.100.13, port=0 localhost:3808853:3809578 [2] MCCL INFO ip=192.168.100.13, port=0 localhost:3808852:3809575 [1] MCCL INFO comm=0x7fd9b9c24010, lastStream is initialized localhost:3808857:3809577 [5] MCCL INFO Not Supported NIC: rocep95s0 localhost:3808857:3809577 [5] MCCL INFO NET/IB : Using [0]rocep6s0:1/RoCE [1]rocep95s0:1/RoCE ; OOB p51p1:192.168.100.13<0> localhost:3808857:3809577 [5] MCCL INFO Using network IB localhost:3808852:3809575 [1] MCCL INFO ip=192.168.100.13, port=0 localhost:3808852:3809575 [1] MCCL INFO ip=192.168.100.13, port=0 localhost:3808857:3809577 [5] MCCL INFO comm=0x7f3c2ee24010, lastStream is initialized localhost:3808857:3809577 [5] MCCL INFO ip=192.168.100.13, port=0 localhost:3808857:3809577 [5] MCCL INFO ip=192.168.100.13, port=0 localhost:3808851:3809571 [0] MCCL INFO ip=192.168.100.13, port=0 localhost:3808852:3809575 [1] MCCL INFO ip=192.168.100.13, port=0 localhost:3808853:3809578 [2] MCCL INFO ip=192.168.100.13, port=0 localhost:3808854:3809576 [3] MCCL INFO ip=192.168.100.13, port=0 localhost:3808857:3809577 [5] 
MCCL INFO ip=192.168.100.13, port=0 localhost:3808855:3809573 [4] MCCL INFO ip=192.168.100.13, port=0 localhost:3808858:3809574 [6] MCCL INFO ip=192.168.100.13, port=0 localhost:3808864:3809579 [7] MCCL INFO ip=192.168.100.13, port=0 localhost:1542596:1542727 [6] MCCL INFO ip=192.168.100.14, port=0 localhost:1542597:1542724 [7] MCCL INFO ip=192.168.100.14, port=0 localhost:1542595:1542729 [5] MCCL INFO ip=192.168.100.14, port=0 localhost:1542593:1542726 [3] MCCL INFO ip=192.168.100.14, port=0 localhost:1542591:1542730 [1] MCCL INFO ip=192.168.100.14, port=0 localhost:1542592:1542731 [2] MCCL INFO ip=192.168.100.14, port=0 localhost:1542594:1542728 [4] MCCL INFO ip=192.168.100.14, port=0 localhost:1542590:1542725 [0] MCCL INFO ip=192.168.100.14, port=0 localhost:3808854:3809576 [3] MCCL INFO ip=192.168.100.13, port=0 localhost:3808852:3809575 [1] MCCL INFO ip=192.168.100.13, port=0 localhost:3808853:3809578 [2] MCCL INFO ip=192.168.100.13, port=0 localhost:3808851:3809571 [0] MCCL INFO ip=192.168.100.13, port=0 localhost:3808857:3809577 [5] MCCL INFO ip=192.168.100.13, port=0 localhost:3808858:3809574 [6] MCCL INFO ip=192.168.100.13, port=0 localhost:3808855:3809573 [4] MCCL INFO ip=192.168.100.13, port=0 localhost:3808864:3809579 [7] MCCL INFO ip=192.168.100.13, port=0 localhost:1542593:1542726 [3] MCCL INFO ip=192.168.100.14, port=0 localhost:1542592:1542731 [2] MCCL INFO ip=192.168.100.14, port=0 localhost:3808852:3809575 [1] MCCL INFO RAS client listening socket at ::1<28028> localhost:1542595:1542729 [5] MCCL INFO ip=192.168.100.14, port=0 localhost:1542591:1542730 [1] MCCL INFO ip=192.168.100.14, port=0 localhost:1542596:1542727 [6] MCCL INFO ip=192.168.100.14, port=0 localhost:1542594:1542728 [4] MCCL INFO ip=192.168.100.14, port=0 localhost:3808853:3809578 [2] MCCL INFO RAS client listening socket at ::1<28028> localhost:3808857:3809577 [5] MCCL INFO RAS client listening socket at ::1<28028> localhost:1542597:1542724 [7] MCCL INFO ip=192.168.100.14, port=0 
localhost:1542590:1542725 [0] MCCL INFO ip=192.168.100.14, port=0 localhost:1542593:1542726 [3] MCCL INFO RAS client listening socket at ::1<28028> localhost:3808854:3809576 [3] MCCL INFO RAS client listening socket at ::1<28028> localhost:1542592:1542731 [2] MCCL INFO RAS client listening socket at ::1<28028> localhost:3808855:3809573 [4] MCCL INFO RAS client listening socket at ::1<28028> localhost:3808851:3809571 [0] MCCL INFO RAS client listening socket at ::1<28028> localhost:3808864:3809579 [7] MCCL INFO RAS client listening socket at ::1<28028> localhost:1542591:1542730 [1] MCCL INFO RAS client listening socket at ::1<28028> localhost:3808858:3809574 [6] MCCL INFO RAS client listening socket at ::1<28028> localhost:1542595:1542729 [5] MCCL INFO RAS client listening socket at ::1<28028> localhost:1542596:1542727 [6] MCCL INFO RAS client listening socket at ::1<28028> localhost:1542590:1542725 [0] MCCL INFO RAS client listening socket at ::1<28028> localhost:1542594:1542728 [4] MCCL INFO RAS client listening socket at ::1<28028> localhost:1542597:1542724 [7] MCCL INFO RAS client listening socket at ::1<28028> localhost:1542596:1542727 [6] MCCL INFO rank 14 cudeDev 6, local 3 socket 3 peer -1, fabricId 0 index 0 directLinkGroup 0 totalGpu 0, switchId 0 localhost:1542596:1542727 [6] MCCL INFO type:0, uuid: local 0x47ef1210, remote 0xd54f50f9 0x0, topology 2, isa: 10.0 localhost:1542595:1542729 [5] MCCL INFO rank 13 cudeDev 5, local 0 socket 0 peer -1, fabricId 0 index 0 directLinkGroup 0 totalGpu 0, switchId 0 localhost:1542595:1542729 [5] MCCL INFO type:0, uuid: local 0x47ef1210, remote 0xd54f50f9 0x0, topology 2, isa: 10.0 localhost:1542597:1542724 [7] MCCL INFO rank 15 cudeDev 7, local 1 socket 1 peer -1, fabricId 0 index 0 directLinkGroup 0 totalGpu 0, switchId 0 localhost:1542597:1542724 [7] MCCL INFO type:0, uuid: local 0x47ef1210, remote 0xd54f50f9 0x0, topology 2, isa: 10.0 localhost:3808852:3809575 [1] MCCL INFO rank 1 cudeDev 1, local 2 socket 2 peer 
-1, fabricId 0 index 0 directLinkGroup 0 totalGpu 0, switchId 0 localhost:3808852:3809575 [1] MCCL INFO type:0, uuid: local 0xeeed1921, remote 0xd54f50f9 0x0, topology 2, isa: 10.0 localhost:3808857:3809577 [5] MCCL INFO rank 5 cudeDev 5, local 0 socket 0 peer -1, fabricId 0 index 0 directLinkGroup 0 totalGpu 0, switchId 0 localhost:3808853:3809578 [2] MCCL INFO rank 2 cudeDev 2, local 3 socket 3 peer -1, fabricId 0 index 0 directLinkGroup 0 totalGpu 0, switchId 0 localhost:3808853:3809578 [2] MCCL INFO type:0, uuid: local 0xeeed1921, remote 0xd54f50f9 0x0, topology 2, isa: 10.0 localhost:3808864:3809579 [7] MCCL INFO rank 7 cudeDev 7, local 1 socket 1 peer -1, fabricId 0 index 0 directLinkGroup 0 totalGpu 0, switchId 0 localhost:3808864:3809579 [7] MCCL INFO type:0, uuid: local 0xeeed1921, remote 0xd54f50f9 0x0, topology 2, isa: 10.0 localhost:3808854:3809576 [3] MCCL INFO rank 3 cudeDev 3, local 1 socket 1 peer -1, fabricId 0 index 0 directLinkGroup 0 totalGpu 0, switchId 0 localhost:3808854:3809576 [3] MCCL INFO type:0, uuid: local 0xeeed1921, remote 0xd54f50f9 0x0, topology 2, isa: 10.0 localhost:3808857:3809577 [5] MCCL INFO type:0, uuid: local 0xeeed1921, remote 0xd54f50f9 0x0, topology 2, isa: 10.0 localhost:3808858:3809574 [6] MCCL INFO rank 6 cudeDev 6, local 3 socket 3 peer -1, fabricId 0 index 0 directLinkGroup 0 totalGpu 0, switchId 0 localhost:3808858:3809574 [6] MCCL INFO type:0, uuid: local 0xeeed1921, remote 0xd54f50f9 0x0, topology 2, isa: 10.0 localhost:1542593:1542726 [3] MCCL INFO rank 11 cudeDev 3, local 1 socket 1 peer -1, fabricId 0 index 0 directLinkGroup 0 totalGpu 0, switchId 0 localhost:1542593:1542726 [3] MCCL INFO type:0, uuid: local 0x47ef1210, remote 0xd54f50f9 0x0, topology 2, isa: 10.0 localhost:3808851:3809571 [0] MCCL INFO rank 0 cudeDev 0, local 0 socket 0 peer -1, fabricId 0 index 0 directLinkGroup 0 totalGpu 0, switchId 0 localhost:3808851:3809571 [0] MCCL INFO type:0, uuid: local 0xeeed1921, remote 0xd54f50f9 0x0, topology 2, 
isa: 10.0 localhost:3808855:3809573 [4] MCCL INFO rank 4 cudeDev 4, local 2 socket 2 peer -1, fabricId 0 index 0 directLinkGroup 0 totalGpu 0, switchId 0 localhost:3808855:3809573 [4] MCCL INFO type:0, uuid: local 0xeeed1921, remote 0xd54f50f9 0x0, topology 2, isa: 10.0 localhost:1542594:1542728 [4] MCCL INFO rank 12 cudeDev 4, local 2 socket 2 peer -1, fabricId 0 index 0 directLinkGroup 0 totalGpu 0, switchId 0 localhost:1542594:1542728 [4] MCCL INFO type:0, uuid: local 0x47ef1210, remote 0xd54f50f9 0x0, topology 2, isa: 10.0 localhost:1542591:1542730 [1] MCCL INFO rank 9 cudeDev 1, local 2 socket 2 peer -1, fabricId 0 index 0 directLinkGroup 0 totalGpu 0, switchId 0 localhost:1542591:1542730 [1] MCCL INFO type:0, uuid: local 0x47ef1210, remote 0xd54f50f9 0x0, topology 2, isa: 10.0 localhost:1542590:1542725 [0] MCCL INFO rank 8 cudeDev 0, local 0 socket 0 peer -1, fabricId 0 index 0 directLinkGroup 0 totalGpu 0, switchId 0 localhost:1542590:1542725 [0] MCCL INFO type:0, uuid: local 0x47ef1210, remote 0xd54f50f9 0x0, topology 2, isa: 10.0 localhost:1542592:1542731 [2] MCCL INFO rank 10 cudeDev 2, local 3 socket 3 peer -1, fabricId 0 index 0 directLinkGroup 0 totalGpu 0, switchId 0 localhost:1542592:1542731 [2] MCCL INFO type:0, uuid: local 0x47ef1210, remote 0xd54f50f9 0x0, topology 2, isa: 10.0 localhost:1542596:1542727 [6] MCCL INFO NET/IB : GPU Direct RDMA Disabled for HCA 0 'rocep6s0' localhost:1542596:1542727 [6] MCCL INFO NET/IB : GPU Direct RDMA Disabled for HCA 1 'rocep95s0' localhost:1542596:1542727 [6] MCCL INFO MCCL_P2P_LEVEL set by environment to LOC localhost:1542595:1542729 [5] MCCL INFO NET/IB : GPU Direct RDMA Disabled for HCA 0 'rocep6s0' localhost:1542595:1542729 [5] MCCL INFO NET/IB : GPU Direct RDMA Disabled for HCA 1 'rocep95s0' localhost:1542595:1542729 [5] MCCL INFO MCCL_P2P_LEVEL set by environment to LOC localhost:3808855:3809573 [4] MCCL INFO NET/IB : GPU Direct RDMA Disabled for HCA 0 'rocep6s0' localhost:3808854:3809576 [3] MCCL INFO 
NET/IB : GPU Direct RDMA Disabled for HCA 0 'rocep6s0' localhost:3808855:3809573 [4] MCCL INFO NET/IB : GPU Direct RDMA Disabled for HCA 1 'rocep95s0' localhost:3808854:3809576 [3] MCCL INFO NET/IB : GPU Direct RDMA Disabled for HCA 1 'rocep95s0' localhost:1542591:1542730 [1] MCCL INFO NET/IB : GPU Direct RDMA Disabled for HCA 0 'rocep6s0' localhost:1542591:1542730 [1] MCCL INFO NET/IB : GPU Direct RDMA Disabled for HCA 1 'rocep95s0' localhost:3808855:3809573 [4] MCCL INFO MCCL_P2P_LEVEL set by environment to LOC localhost:1542597:1542724 [7] MCCL INFO NET/IB : GPU Direct RDMA Disabled for HCA 0 'rocep6s0' localhost:3808854:3809576 [3] MCCL INFO MCCL_P2P_LEVEL set by environment to LOC localhost:1542597:1542724 [7] MCCL INFO NET/IB : GPU Direct RDMA Disabled for HCA 1 'rocep95s0' localhost:3808864:3809579 [7] MCCL INFO NET/IB : GPU Direct RDMA Disabled for HCA 0 'rocep6s0' localhost:1542591:1542730 [1] MCCL INFO MCCL_P2P_LEVEL set by environment to LOC localhost:3808858:3809574 [6] MCCL INFO NET/IB : GPU Direct RDMA Disabled for HCA 0 'rocep6s0' localhost:3808864:3809579 [7] MCCL INFO NET/IB : GPU Direct RDMA Disabled for HCA 1 'rocep95s0' localhost:1542597:1542724 [7] MCCL INFO MCCL_P2P_LEVEL set by environment to LOC localhost:3808858:3809574 [6] MCCL INFO NET/IB : GPU Direct RDMA Disabled for HCA 1 'rocep95s0' localhost:1542593:1542726 [3] MCCL INFO NET/IB : GPU Direct RDMA Disabled for HCA 0 'rocep6s0' localhost:1542592:1542731 [2] MCCL INFO NET/IB : GPU Direct RDMA Disabled for HCA 0 'rocep6s0' localhost:1542594:1542728 [4] MCCL INFO NET/IB : GPU Direct RDMA Disabled for HCA 0 'rocep6s0' localhost:1542590:1542725 [0] MCCL INFO NET/IB : GPU Direct RDMA Disabled for HCA 0 'rocep6s0' localhost:1542593:1542726 [3] MCCL INFO NET/IB : GPU Direct RDMA Disabled for HCA 1 'rocep95s0' localhost:3808864:3809579 [7] MCCL INFO MCCL_P2P_LEVEL set by environment to LOC localhost:1542592:1542731 [2] MCCL INFO NET/IB : GPU Direct RDMA Disabled for HCA 1 'rocep95s0' 
localhost:1542594:1542728 [4] MCCL INFO NET/IB : GPU Direct RDMA Disabled for HCA 1 'rocep95s0' localhost:1542590:1542725 [0] MCCL INFO NET/IB : GPU Direct RDMA Disabled for HCA 1 'rocep95s0' localhost:3808858:3809574 [6] MCCL INFO MCCL_P2P_LEVEL set by environment to LOC localhost:1542593:1542726 [3] MCCL INFO MCCL_P2P_LEVEL set by environment to LOC localhost:3808857:3809577 [5] MCCL INFO NET/IB : GPU Direct RDMA Disabled for HCA 0 'rocep6s0' localhost:1542592:1542731 [2] MCCL INFO MCCL_P2P_LEVEL set by environment to LOC localhost:3808857:3809577 [5] MCCL INFO NET/IB : GPU Direct RDMA Disabled for HCA 1 'rocep95s0' localhost:3808851:3809571 [0] MCCL INFO NET/IB : GPU Direct RDMA Disabled for HCA 0 'rocep6s0' localhost:3808853:3809578 [2] MCCL INFO NET/IB : GPU Direct RDMA Disabled for HCA 0 'rocep6s0' localhost:3808852:3809575 [1] MCCL INFO NET/IB : GPU Direct RDMA Disabled for HCA 0 'rocep6s0' localhost:1542590:1542725 [0] MCCL INFO MCCL_P2P_LEVEL set by environment to LOC localhost:1542594:1542728 [4] MCCL INFO MCCL_P2P_LEVEL set by environment to LOC localhost:3808851:3809571 [0] MCCL INFO NET/IB : GPU Direct RDMA Disabled for HCA 1 'rocep95s0' localhost:3808852:3809575 [1] MCCL INFO NET/IB : GPU Direct RDMA Disabled for HCA 1 'rocep95s0' localhost:3808853:3809578 [2] MCCL INFO NET/IB : GPU Direct RDMA Disabled for HCA 1 'rocep95s0' localhost:3808857:3809577 [5] MCCL INFO MCCL_P2P_LEVEL set by environment to LOC localhost:3808853:3809578 [2] MCCL INFO MCCL_P2P_LEVEL set by environment to LOC localhost:3808851:3809571 [0] MCCL INFO MCCL_P2P_LEVEL set by environment to LOC localhost:3808852:3809575 [1] MCCL INFO MCCL_P2P_LEVEL set by environment to LOC localhost:1542594:1542728 [4] MCCL INFO NET/IB : GPU Direct RDMA Disabled for HCA 0 'rocep6s0' localhost:3808854:3809576 [3] MCCL INFO NET/IB : GPU Direct RDMA Disabled for HCA 0 'rocep6s0' localhost:1542596:1542727 [6] MCCL INFO NET/IB : GPU Direct RDMA Disabled for HCA 0 'rocep6s0' localhost:1542595:1542729 [5] 
MCCL INFO NET/IB : GPU Direct RDMA Disabled for HCA 0 'rocep6s0' localhost:1542594:1542728 [4] MCCL INFO NET/IB : GPU Direct RDMA Disabled for HCA 1 'rocep95s0' localhost:3808853:3809578 [2] MCCL INFO NET/IB : GPU Direct RDMA Disabled for HCA 0 'rocep6s0' localhost:3808854:3809576 [3] MCCL INFO NET/IB : GPU Direct RDMA Disabled for HCA 1 'rocep95s0' localhost:1542597:1542724 [7] MCCL INFO NET/IB : GPU Direct RDMA Disabled for HCA 0 'rocep6s0' localhost:1542596:1542727 [6] MCCL INFO NET/IB : GPU Direct RDMA Disabled for HCA 1 'rocep95s0' localhost:1542595:1542729 [5] MCCL INFO NET/IB : GPU Direct RDMA Disabled for HCA 1 'rocep95s0' localhost:1542593:1542726 [3] MCCL INFO NET/IB : GPU Direct RDMA Disabled for HCA 0 'rocep6s0' localhost:3808853:3809578 [2] MCCL INFO NET/IB : GPU Direct RDMA Disabled for HCA 1 'rocep95s0' localhost:1542597:1542724 [7] MCCL INFO NET/IB : GPU Direct RDMA Disabled for HCA 1 'rocep95s0' localhost:3808858:3809574 [6] MCCL INFO NET/IB : GPU Direct RDMA Disabled for HCA 0 'rocep6s0' localhost:1542594:1542728 [4] MCCL INFO Setting affinity for GPU 4 to ffff,00000000,00000000,00000000,0000ffff localhost:3808855:3809573 [4] MCCL INFO NET/IB : GPU Direct RDMA Disabled for HCA 0 'rocep6s0' localhost:3808854:3809576 [3] MCCL INFO Setting affinity for GPU 3 to ffff0000,00000000,00000000,00000000,ffff0000,00000000 localhost:1542593:1542726 [3] MCCL INFO NET/IB : GPU Direct RDMA Disabled for HCA 1 'rocep95s0' localhost:3808851:3809571 [0] MCCL INFO NET/IB : GPU Direct RDMA Disabled for HCA 0 'rocep6s0' localhost:3808852:3809575 [1] MCCL INFO NET/IB : GPU Direct RDMA Disabled for HCA 0 'rocep6s0' localhost:3808857:3809577 [5] MCCL INFO NET/IB : GPU Direct RDMA Disabled for HCA 0 'rocep6s0' localhost:3808858:3809574 [6] MCCL INFO NET/IB : GPU Direct RDMA Disabled for HCA 1 'rocep95s0' localhost:1542591:1542730 [1] MCCL INFO NET/IB : GPU Direct RDMA Disabled for HCA 0 'rocep6s0' localhost:3808864:3809579 [7] MCCL INFO NET/IB : GPU Direct RDMA Disabled 
for HCA 0 'rocep6s0' localhost:1542596:1542727 [6] MCCL INFO Setting affinity for GPU 6 to ffff,00000000,00000000,00000000,0000ffff localhost:1542595:1542729 [5] MCCL INFO Setting affinity for GPU 5 to ffff,00000000,00000000,00000000,0000ffff localhost:3808855:3809573 [4] MCCL INFO NET/IB : GPU Direct RDMA Disabled for HCA 1 'rocep95s0' localhost:1542592:1542731 [2] MCCL INFO NET/IB : GPU Direct RDMA Disabled for HCA 0 'rocep6s0' localhost:3808853:3809578 [2] MCCL INFO Setting affinity for GPU 2 to ffff0000,00000000,00000000,00000000,ffff0000,00000000 localhost:1542597:1542724 [7] MCCL INFO Setting affinity for GPU 7 to ffff,00000000,00000000,00000000,0000ffff localhost:1542590:1542725 [0] MCCL INFO NET/IB : GPU Direct RDMA Disabled for HCA 0 'rocep6s0' localhost:1542591:1542730 [1] MCCL INFO NET/IB : GPU Direct RDMA Disabled for HCA 1 'rocep95s0' localhost:1542592:1542731 [2] MCCL INFO NET/IB : GPU Direct RDMA Disabled for HCA 1 'rocep95s0' localhost:1542590:1542725 [0] MCCL INFO NET/IB : GPU Direct RDMA Disabled for HCA 1 'rocep95s0' localhost:3808851:3809571 [0] MCCL INFO NET/IB : GPU Direct RDMA Disabled for HCA 1 'rocep95s0' localhost:3808852:3809575 [1] MCCL INFO NET/IB : GPU Direct RDMA Disabled for HCA 1 'rocep95s0' localhost:1542593:1542726 [3] MCCL INFO Setting affinity for GPU 3 to ffff0000,00000000,00000000,00000000,ffff0000,00000000 localhost:3808857:3809577 [5] MCCL INFO NET/IB : GPU Direct RDMA Disabled for HCA 1 'rocep95s0' localhost:3808864:3809579 [7] MCCL INFO NET/IB : GPU Direct RDMA Disabled for HCA 1 'rocep95s0' localhost:1542591:1542730 [1] MCCL INFO Setting affinity for GPU 1 to ffff0000,00000000,00000000,00000000,ffff0000,00000000 localhost:3808858:3809574 [6] MCCL INFO Setting affinity for GPU 6 to ffff,00000000,00000000,00000000,0000ffff localhost:3808855:3809573 [4] MCCL INFO Setting affinity for GPU 4 to ffff,00000000,00000000,00000000,0000ffff localhost:3808851:3809571 [0] MCCL INFO Setting affinity for GPU 0 to 
ffff0000,00000000,00000000,00000000,ffff0000,00000000 localhost:3808852:3809575 [1] MCCL INFO Setting affinity for GPU 1 to ffff0000,00000000,00000000,00000000,ffff0000,00000000 localhost:1542590:1542725 [0] MCCL INFO Setting affinity for GPU 0 to ffff0000,00000000,00000000,00000000,ffff0000,00000000 localhost:3808857:3809577 [5] MCCL INFO Setting affinity for GPU 5 to ffff,00000000,00000000,00000000,0000ffff localhost:1542592:1542731 [2] MCCL INFO Setting affinity for GPU 2 to ffff0000,00000000,00000000,00000000,ffff0000,00000000 localhost:3808864:3809579 [7] MCCL INFO Setting affinity for GPU 7 to ffff,00000000,00000000,00000000,0000ffff localhost:1542593:1542726 [3] MCCL INFO MCCL pcie buffer mode 0, rcCount 2, nodeType 0x0, minAp 104 localhost:1542592:1542731 [2] MCCL INFO MCCL pcie buffer mode 0, rcCount 2, nodeType 0x0, minAp 104 localhost:1542590:1542725 [0] MCCL INFO MCCL pcie buffer mode 0, rcCount 2, nodeType 0x0, minAp 104 localhost:1542595:1542729 [5] MCCL INFO MCCL pcie buffer mode 0, rcCount 2, nodeType 0x0, minAp 104 localhost:1542591:1542730 [1] MCCL INFO MCCL pcie buffer mode 0, rcCount 2, nodeType 0x0, minAp 104 localhost:1542596:1542727 [6] MCCL INFO MCCL pcie buffer mode 0, rcCount 2, nodeType 0x0, minAp 104 localhost:3808855:3809573 [4] MCCL INFO MCCL pcie buffer mode 0, rcCount 2, nodeType 0x0, minAp 104 localhost:3808858:3809574 [6] MCCL INFO MCCL pcie buffer mode 0, rcCount 2, nodeType 0x0, minAp 104 localhost:3808857:3809577 [5] MCCL INFO MCCL pcie buffer mode 0, rcCount 2, nodeType 0x0, minAp 104 localhost:3808854:3809576 [3] MCCL INFO MCCL pcie buffer mode 0, rcCount 2, nodeType 0x0, minAp 104 localhost:3808864:3809579 [7] MCCL INFO MCCL pcie buffer mode 0, rcCount 2, nodeType 0x0, minAp 104 localhost:1542594:1542728 [4] MCCL INFO MCCL pcie buffer mode 0, rcCount 2, nodeType 0x0, minAp 104 localhost:3808852:3809575 [1] MCCL INFO MCCL pcie buffer mode 0, rcCount 2, nodeType 0x0, minAp 104 localhost:3808853:3809578 [2] MCCL INFO MCCL pcie 
buffer mode 0, rcCount 2, nodeType 0x0, minAp 104 localhost:3808851:3809571 [0] MCCL INFO MCCL pcie buffer mode 0, rcCount 2, nodeType 0x0, minAp 104 localhost:1542597:1542724 [7] MCCL INFO MCCL pcie buffer mode 0, rcCount 2, nodeType 0x0, minAp 104 localhost:1542597:1542724 [7] MCCL INFO Disabling clique-based kernels due to topology (ignore with CLIQUE_IGNORE_TOPO) localhost:1542597:1542724 [7] MCCL INFO CLIQUE InitRemoteReadToggle enable_read:0, isaVersion:1000, topologyId:2 localhost:1542597:1542724 [7] MCCL INFO Clique kernels disabled localhost:3808851:3809571 [0] MCCL INFO Disabling clique-based kernels due to topology (ignore with CLIQUE_IGNORE_TOPO) localhost:3808851:3809571 [0] MCCL INFO CLIQUE InitRemoteReadToggle enable_read:0, isaVersion:1000, topologyId:2 localhost:3808851:3809571 [0] MCCL INFO Clique kernels disabled localhost:1542597:1542724 [7] MCCL INFO Trees [0] 11/-1/-1->15->14 [1] 11/-1/-1->15->14 [2] 11/-1/-1->15->14 [3] 11/-1/-1->15->14 [4] 11/-1/-1->15->14 [5] 11/-1/-1->15->14 [6] 11/-1/-1->15->14 [7] 11/-1/-1->15->14 [8] 11/-1/-1->15->14 [9] 11/-1/-1->15->14 [10] 11/-1/-1->15->14 [11] 11/-1/-1->15->14 [12] 11/-1/-1->15->14 [13] 11/-1/-1->15->14 [14] 11/-1/-1->15->14 [15] 11/-1/-1->15->14 [16] 11/-1/-1->15->14 [17] 11/-1/-1->15->14 [18] 11/-1/-1->15->14 [19] 11/-1/-1->15->14 [20] 11/-1/-1->15->14 [21] 11/-1/-1->15->14 [22] 11/-1/-1->15->14 [23] 11/-1/-1->15->14 [24] 11/-1/-1->15->14 [25] 11/-1/-1->15->14 [26] 11/-1/-1->15->14 [27] 11/-1/-1->15->14 [28] 11/-1/-1->15->14 [29] 11/-1/-1->15->14 [30] 11/-1/-1->15->14 [31] 11/-1/-1->15->14 comm 0x7f75cdd40010 nRanks 16 busId 5e000 localhost:1542597:1542724 [7] MCCL INFO P2P Chunksize set to 131072 localhost:1542597:1542724 [7] MCCL INFO disable groupWriteback localhost:3808852:3809575 [1] MCCL INFO Disabling clique-based kernels due to topology (ignore with CLIQUE_IGNORE_TOPO) localhost:3808854:3809576 [3] MCCL INFO Disabling clique-based kernels due to topology (ignore with CLIQUE_IGNORE_TOPO) 
localhost:3808854:3809576 [3] MCCL INFO CLIQUE InitRemoteReadToggle enable_read:0, isaVersion:1000, topologyId:2 localhost:3808854:3809576 [3] MCCL INFO Clique kernels disabled localhost:3808853:3809578 [2] MCCL INFO Disabling clique-based kernels due to topology (ignore with CLIQUE_IGNORE_TOPO) localhost:3808853:3809578 [2] MCCL INFO CLIQUE InitRemoteReadToggle enable_read:0, isaVersion:1000, topologyId:2 localhost:3808853:3809578 [2] MCCL INFO Clique kernels disabled localhost:3808855:3809573 [4] MCCL INFO Disabling clique-based kernels due to topology (ignore with CLIQUE_IGNORE_TOPO) localhost:3808851:3809571 [0] MCCL INFO Channel 00/32 :    0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15 localhost:3808851:3809571 [0] MCCL INFO Channel 01/32 :    0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15 localhost:3808851:3809571 [0] MCCL INFO Channel 02/32 :    0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15 localhost:3808851:3809571 [0] MCCL INFO Channel 03/32 :    0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15 localhost:3808851:3809571 [0] MCCL INFO Channel 04/32 :    0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15 localhost:3808851:3809571 [0] MCCL INFO Channel 05/32 :    0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15 localhost:3808851:3809571 [0] MCCL INFO Channel 06/32 :    0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15 localhost:3808851:3809571 [0] MCCL INFO Channel 07/32 :    0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15 localhost:3808855:3809573 [4] MCCL INFO CLIQUE InitRemoteReadToggle enable_read:0, isaVersion:1000, topologyId:2 localhost:3808855:3809573 [4] MCCL INFO Clique kernels disabled localhost:3808852:3809575 [1] MCCL INFO CLIQUE InitRemoteReadToggle enable_read:0, isaVersion:1000, topologyId:2 localhost:3808852:3809575 [1] MCCL INFO Clique kernels disabled localhost:3808854:3809576 [3] MCCL INFO Trees [0] -1/-1/-1->3->7 [1] -1/-1/-1->3->7 
[2] -1/-1/-1->3->7 [3] -1/-1/-1->3->7 [4] -1/-1/-1->3->7 [5] -1/-1/-1->3->7 [6] -1/-1/-1->3->7 [7] -1/-1/-1->3->7 [8] -1/-1/-1->3->7 [9] -1/-1/-1->3->7 [10] -1/-1/-1->3->7 [11] -1/-1/-1->3->7 [12] -1/-1/-1->3->7 [13] -1/-1/-1->3->7 [14] -1/-1/-1->3->7 [15] -1/-1/-1->3->7 [16] -1/-1/-1->3->7 [17] -1/-1/-1->3->7 [18] -1/-1/-1->3->7 [19] -1/-1/-1->3->7 [20] -1/-1/-1->3->7 [21] -1/-1/-1->3->7 [22] -1/-1/-1->3->7 [23] -1/-1/-1->3->7 [24] -1/-1/-1->3->7 [25] -1/-1/-1->3->7 [26] -1/-1/-1->3->7 [27] -1/-1/-1->3->7 [28] -1/-1/-1->3->7 [29] -1/-1/-1->3->7 [30] -1/-1/-1->3->7 [31] -1/-1/-1->3->7 comm 0x7f10c1b6a010 nRanks 16 busId f000 localhost:3808854:3809576 [3] MCCL INFO P2P Chunksize set to 131072 localhost:3808854:3809576 [3] MCCL INFO disable groupWriteback localhost:3808851:3809571 [0] MCCL INFO Channel 08/32 :    0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15 localhost:3808851:3809571 [0] MCCL INFO Channel 09/32 :    0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15 localhost:3808851:3809571 [0] MCCL INFO Channel 10/32 :    0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15 localhost:3808851:3809571 [0] MCCL INFO Channel 11/32 :    0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15 localhost:3808851:3809571 [0] MCCL INFO Channel 12/32 :    0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15 localhost:3808851:3809571 [0] MCCL INFO Channel 13/32 :    0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15 localhost:3808851:3809571 [0] MCCL INFO Channel 14/32 :    0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15 localhost:3808851:3809571 [0] MCCL INFO Channel 15/32 :    0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15 localhost:3808851:3809571 [0] MCCL INFO Channel 16/32 :    0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15 localhost:3808851:3809571 [0] MCCL INFO Channel 17/32 :    0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15 localhost:3808851:3809571 
[0] MCCL INFO Channel 18/32 :    0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15 localhost:3808858:3809574 [6] MCCL INFO Disabling clique-based kernels due to topology (ignore with CLIQUE_IGNORE_TOPO) localhost:3808858:3809574 [6] MCCL INFO CLIQUE InitRemoteReadToggle enable_read:0, isaVersion:1000, topologyId:2 localhost:3808858:3809574 [6] MCCL INFO Clique kernels disabled localhost:3808864:3809579 [7] MCCL INFO Disabling clique-based kernels due to topology (ignore with CLIQUE_IGNORE_TOPO) localhost:3808864:3809579 [7] MCCL INFO CLIQUE InitRemoteReadToggle enable_read:0, isaVersion:1000, topologyId:2 localhost:3808864:3809579 [7] MCCL INFO Clique kernels disabled localhost:3808857:3809577 [5] MCCL INFO Disabling clique-based kernels due to topology (ignore with CLIQUE_IGNORE_TOPO) localhost:3808857:3809577 [5] MCCL INFO CLIQUE InitRemoteReadToggle enable_read:0, isaVersion:1000, topologyId:2 localhost:3808857:3809577 [5] MCCL INFO Clique kernels disabled localhost:3808857:3809577 [5] MCCL INFO Trees [0] 6/-1/-1->5->4 [1] 6/-1/-1->5->4 [2] 6/-1/-1->5->4 [3] 6/-1/-1->5->4 [4] 6/-1/-1->5->4 [5] 6/-1/-1->5->4 [6] 6/-1/-1->5->4 [7] 6/-1/-1->5->4 [8] 6/-1/-1->5->4 [9] 6/-1/-1->5->4 [10] 6/-1/-1->5->4 [11] 6/-1/-1->5->4 [12] 6/-1/-1->5->4 [13] 6/-1/-1->5->4 [14] 6/-1/-1->5->4 [15] 6/-1/-1->5->4 [16] 6/-1/-1->5->4 [17] 6/-1/-1->5->4 [18] 6/-1/-1->5->4 [19] 6/-1/-1->5->4 [20] 6/-1/-1->5->4 [21] 6/-1/-1->5->4 [22] 6/-1/-1->5->4 [23] 6/-1/-1->5->4 [24] 6/-1/-1->5->4 [25] 6/-1/-1->5->4 [26] 6/-1/-1->5->4 [27] 6/-1/-1->5->4 [28] 6/-1/-1->5->4 [29] 6/-1/-1->5->4 [30] 6/-1/-1->5->4 [31] 6/-1/-1->5->4 comm 0x7f3c2ee24010 nRanks 16 busId 56000 localhost:3808857:3809577 [5] MCCL INFO P2P Chunksize set to 131072 localhost:3808857:3809577 [5] MCCL INFO disable groupWriteback localhost:3808855:3809573 [4] MCCL INFO Trees [0] 5/-1/-1->4->2 [1] 5/-1/-1->4->2 [2] 5/-1/-1->4->2 [3] 5/-1/-1->4->2 [4] 5/-1/-1->4->2 [5] 5/-1/-1->4->2 [6] 5/-1/-1->4->2 [7] 5/-1/-1->4->2 [8] 
5/-1/-1->4->2 [9] 5/-1/-1->4->2 [10] 5/-1/-1->4->2 [11] 5/-1/-1->4->2 [12] 5/-1/-1->4->2 [13] 5/-1/-1->4->2 [14] 5/-1/-1->4->2 [15] 5/-1/-1->4->2 [16] 5/-1/-1->4->2 [17] 5/-1/-1->4->2 [18] 5/-1/-1->4->2 [19] 5/-1/-1->4->2 [20] 5/-1/-1->4->2 [21] 5/-1/-1->4->2 [22] 5/-1/-1->4->2 [23] 5/-1/-1->4->2 [24] 5/-1/-1->4->2 [25] 5/-1/-1->4->2 [26] 5/-1/-1->4->2 [27] 5/-1/-1->4->2 [28] 5/-1/-1->4->2 [29] 5/-1/-1->4->2 [30] 5/-1/-1->4->2 [31] 5/-1/-1->4->2 comm 0x7fb633d40010 nRanks 16 busId 55000 localhost:3808855:3809573 [4] MCCL INFO P2P Chunksize set to 131072 localhost:3808855:3809573 [4] MCCL INFO disable groupWriteback localhost:3808852:3809575 [1] MCCL INFO Trees [0] 2/-1/-1->1->0 [1] 2/-1/-1->1->0 [2] 2/-1/-1->1->0 [3] 2/-1/-1->1->0 [4] 2/-1/-1->1->0 [5] 2/-1/-1->1->0 [6] 2/-1/-1->1->0 [7] 2/-1/-1->1->0 [8] 2/-1/-1->1->0 [9] 2/-1/-1->1->0 [10] 2/-1/-1->1->0 [11] 2/-1/-1->1->0 [12] 2/-1/-1->1->0 [13] 2/-1/-1->1->0 [14] 2/-1/-1->1->0 [15] 2/-1/-1->1->0 [16] 2/-1/-1->1->0 [17] 2/-1/-1->1->0 [18] 2/-1/-1->1->0 [19] 2/-1/-1->1->0 [20] 2/-1/-1->1->0 [21] 2/-1/-1->1->0 [22] 2/-1/-1->1->0 [23] 2/-1/-1->1->0 [24] 2/-1/-1->1->0 [25] 2/-1/-1->1->0 [26] 2/-1/-1->1->0 [27] 2/-1/-1->1->0 [28] 2/-1/-1->1->0 [29] 2/-1/-1->1->0 [30] 2/-1/-1->1->0 [31] 2/-1/-1->1->0 comm 0x7fd9b9c24010 nRanks 16 busId b000 localhost:3808852:3809575 [1] MCCL INFO P2P Chunksize set to 131072 localhost:3808852:3809575 [1] MCCL INFO disable groupWriteback localhost:3808853:3809578 [2] MCCL INFO Trees [0] 4/-1/-1->2->1 [1] 4/-1/-1->2->1 [2] 4/-1/-1->2->1 [3] 4/-1/-1->2->1 [4] 4/-1/-1->2->1 [5] 4/-1/-1->2->1 [6] 4/-1/-1->2->1 [7] 4/-1/-1->2->1 [8] 4/-1/-1->2->1 [9] 4/-1/-1->2->1 [10] 4/-1/-1->2->1 [11] 4/-1/-1->2->1 [12] 4/-1/-1->2->1 [13] 4/-1/-1->2->1 [14] 4/-1/-1->2->1 [15] 4/-1/-1->2->1 [16] 4/-1/-1->2->1 [17] 4/-1/-1->2->1 [18] 4/-1/-1->2->1 [19] 4/-1/-1->2->1 [20] 4/-1/-1->2->1 [21] 4/-1/-1->2->1 [22] 4/-1/-1->2->1 [23] 4/-1/-1->2->1 [24] 4/-1/-1->2->1 [25] 4/-1/-1->2->1 [26] 4/-1/-1->2->1 [27] 
4/-1/-1->2->1 [28] 4/-1/-1->2->1 [29] 4/-1/-1->2->1 [30] 4/-1/-1->2->1 [31] 4/-1/-1->2->1 comm 0x7fc704e00010 nRanks 16 busId e000 localhost:3808853:3809578 [2] MCCL INFO P2P Chunksize set to 131072 localhost:3808853:3809578 [2] MCCL INFO disable groupWriteback localhost:3808851:3809571 [0] MCCL INFO Channel 19/32 :    0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15 localhost:3808851:3809571 [0] MCCL INFO Channel 20/32 :    0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15 localhost:3808851:3809571 [0] MCCL INFO Channel 21/32 :    0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15 localhost:3808851:3809571 [0] MCCL INFO Channel 22/32 :    0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15 localhost:3808851:3809571 [0] MCCL INFO Channel 23/32 :    0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15 localhost:3808851:3809571 [0] MCCL INFO Channel 24/32 :    0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15 localhost:3808851:3809571 [0] MCCL INFO Channel 25/32 :    0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15 localhost:3808851:3809571 [0] MCCL INFO Channel 26/32 :    0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15 localhost:3808851:3809571 [0] MCCL INFO Channel 27/32 :    0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15 localhost:3808851:3809571 [0] MCCL INFO Channel 28/32 :    0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15 localhost:3808851:3809571 [0] MCCL INFO Channel 29/32 :    0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15 localhost:3808851:3809571 [0] MCCL INFO Channel 30/32 :    0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15 localhost:3808851:3809571 [0] MCCL INFO Channel 31/32 :    0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15 localhost:3808851:3809571 [0] MCCL INFO Trees [0] 1/8/-1->0->-1 [1] 1/8/-1->0->-1 [2] 1/-1/-1->0->8 [3] 1/-1/-1->0->8 [4] 1/8/-1->0->-1 [5] 1/8/-1->0->-1 [6] 
1/-1/-1->0->8 [7] 1/-1/-1->0->8 [8] 1/8/-1->0->-1 [9] 1/8/-1->0->-1 [10] 1/-1/-1->0->8 [11] 1/-1/-1->0->8 [12] 1/8/-1->0->-1 [13] 1/8/-1->0->-1 [14] 1/-1/-1->0->8 [15] 1/-1/-1->0->8 [16] 1/8/-1->0->-1 [17] 1/8/-1->0->-1 [18] 1/-1/-1->0->8 [19] 1/-1/-1->0->8 [20] 1/8/-1->0->-1 [21] 1/8/-1->0->-1 [22] 1/-1/-1->0->8 [23] 1/-1/-1->0->8 [24] 1/8/-1->0->-1 [25] 1/8/-1->0->-1 [26] 1/-1/-1->0->8 [27] 1/-1/-1->0->8 [28] 1/8/-1->0->-1 [29] 1/8/-1->0->-1 [30] 1/-1/-1->0->8 [31] 1/-1/-1->0->8 comm 0x7feed6140010 nRanks 16 busId 5000 localhost:3808851:3809571 [0] MCCL INFO P2P Chunksize set to 131072 localhost:3808851:3809571 [0] MCCL INFO disable groupWriteback localhost:1542596:1542727 [6] MCCL INFO Disabling clique-based kernels due to topology (ignore with CLIQUE_IGNORE_TOPO) localhost:1542596:1542727 [6] MCCL INFO CLIQUE InitRemoteReadToggle enable_read:0, isaVersion:1000, topologyId:2 localhost:1542596:1542727 [6] MCCL INFO Clique kernels disabled localhost:3808858:3809574 [6] MCCL INFO Trees [0] 7/-1/-1->6->5 [1] 7/-1/-1->6->5 [2] 7/-1/-1->6->5 [3] 7/-1/-1->6->5 [4] 7/-1/-1->6->5 [5] 7/-1/-1->6->5 [6] 7/-1/-1->6->5 [7] 7/-1/-1->6->5 [8] 7/-1/-1->6->5 [9] 7/-1/-1->6->5 [10] 7/-1/-1->6->5 [11] 7/-1/-1->6->5 [12] 7/-1/-1->6->5 [13] 7/-1/-1->6->5 [14] 7/-1/-1->6->5 [15] 7/-1/-1->6->5 [16] 7/-1/-1->6->5 [17] 7/-1/-1->6->5 [18] 7/-1/-1->6->5 [19] 7/-1/-1->6->5 [20] 7/-1/-1->6->5 [21] 7/-1/-1->6->5 [22] 7/-1/-1->6->5 [23] 7/-1/-1->6->5 [24] 7/-1/-1->6->5 [25] 7/-1/-1->6->5 [26] 7/-1/-1->6->5 [27] 7/-1/-1->6->5 [28] 7/-1/-1->6->5 [29] 7/-1/-1->6->5 [30] 7/-1/-1->6->5 [31] 7/-1/-1->6->5 comm 0x7fcf8a717010 nRanks 16 busId 5b000 localhost:3808858:3809574 [6] MCCL INFO P2P Chunksize set to 131072 localhost:3808858:3809574 [6] MCCL INFO disable groupWriteback localhost:3808864:3809579 [7] MCCL INFO Trees [0] 3/-1/-1->7->6 [1] 3/-1/-1->7->6 [2] 3/-1/-1->7->6 [3] 3/-1/-1->7->6 [4] 3/-1/-1->7->6 [5] 3/-1/-1->7->6 [6] 3/-1/-1->7->6 [7] 3/-1/-1->7->6 [8] 3/-1/-1->7->6 [9] 3/-1/-1->7->6 
[10] 3/-1/-1->7->6 [11] 3/-1/-1->7->6 [12] 3/-1/-1->7->6 [13] 3/-1/-1->7->6 [14] 3/-1/-1->7->6 [15] 3/-1/-1->7->6 [16] 3/-1/-1->7->6 [17] 3/-1/-1->7->6 [18] 3/-1/-1->7->6 [19] 3/-1/-1->7->6 [20] 3/-1/-1->7->6 [21] 3/-1/-1->7->6 [22] 3/-1/-1->7->6 [23] 3/-1/-1->7->6 [24] 3/-1/-1->7->6 [25] 3/-1/-1->7->6 [26] 3/-1/-1->7->6 [27] 3/-1/-1->7->6 [28] 3/-1/-1->7->6 [29] 3/-1/-1->7->6 [30] 3/-1/-1->7->6 [31] 3/-1/-1->7->6 comm 0x7f14a6340010 nRanks 16 busId 5e000 localhost:3808864:3809579 [7] MCCL INFO P2P Chunksize set to 131072 localhost:3808864:3809579 [7] MCCL INFO disable groupWriteback localhost:1542593:1542726 [3] MCCL INFO Disabling clique-based kernels due to topology (ignore with CLIQUE_IGNORE_TOPO) localhost:1542593:1542726 [3] MCCL INFO CLIQUE InitRemoteReadToggle enable_read:0, isaVersion:1000, topologyId:2 localhost:1542593:1542726 [3] MCCL INFO Clique kernels disabled localhost:1542592:1542731 [2] MCCL INFO Disabling clique-based kernels due to topology (ignore with CLIQUE_IGNORE_TOPO) localhost:1542592:1542731 [2] MCCL INFO CLIQUE InitRemoteReadToggle enable_read:0, isaVersion:1000, topologyId:2 localhost:1542592:1542731 [2] MCCL INFO Clique kernels disabled localhost:1542594:1542728 [4] MCCL INFO Disabling clique-based kernels due to topology (ignore with CLIQUE_IGNORE_TOPO) localhost:1542595:1542729 [5] MCCL INFO Disabling clique-based kernels due to topology (ignore with CLIQUE_IGNORE_TOPO) localhost:1542595:1542729 [5] MCCL INFO CLIQUE InitRemoteReadToggle enable_read:0, isaVersion:1000, topologyId:2 localhost:1542595:1542729 [5] MCCL INFO Clique kernels disabled localhost:1542595:1542729 [5] MCCL INFO Trees [0] 14/-1/-1->13->12 [1] 14/-1/-1->13->12 [2] 14/-1/-1->13->12 [3] 14/-1/-1->13->12 [4] 14/-1/-1->13->12 [5] 14/-1/-1->13->12 [6] 14/-1/-1->13->12 [7] 14/-1/-1->13->12 [8] 14/-1/-1->13->12 [9] 14/-1/-1->13->12 [10] 14/-1/-1->13->12 [11] 14/-1/-1->13->12 [12] 14/-1/-1->13->12 [13] 14/-1/-1->13->12 [14] 14/-1/-1->13->12 [15] 14/-1/-1->13->12 [16] 
14/-1/-1->13->12 [17] 14/-1/-1->13->12 [18] 14/-1/-1->13->12 [19] 14/-1/-1->13->12 [20] 14/-1/-1->13->12 [21] 14/-1/-1->13->12 [22] 14/-1/-1->13->12 [23] 14/-1/-1->13->12 [24] 14/-1/-1->13->12 [25] 14/-1/-1->13->12 [26] 14/-1/-1->13->12 [27] 14/-1/-1->13->12 [28] 14/-1/-1->13->12 [29] 14/-1/-1->13->12 [30] 14/-1/-1->13->12 [31] 14/-1/-1->13->12 comm 0x7f93ef76a010 nRanks 16 busId 56000 localhost:1542595:1542729 [5] MCCL INFO P2P Chunksize set to 131072 localhost:1542595:1542729 [5] MCCL INFO disable groupWriteback localhost:1542590:1542725 [0] MCCL INFO Disabling clique-based kernels due to topology (ignore with CLIQUE_IGNORE_TOPO) localhost:1542590:1542725 [0] MCCL INFO CLIQUE InitRemoteReadToggle enable_read:0, isaVersion:1000, topologyId:2 localhost:1542590:1542725 [0] MCCL INFO Clique kernels disabled localhost:1542590:1542725 [0] MCCL INFO Trees [0] 9/-1/-1->8->0 [1] 9/-1/-1->8->0 [2] 9/0/-1->8->-1 [3] 9/0/-1->8->-1 [4] 9/-1/-1->8->0 [5] 9/-1/-1->8->0 [6] 9/0/-1->8->-1 [7] 9/0/-1->8->-1 [8] 9/-1/-1->8->0 [9] 9/-1/-1->8->0 [10] 9/0/-1->8->-1 [11] 9/0/-1->8->-1 [12] 9/-1/-1->8->0 [13] 9/-1/-1->8->0 [14] 9/0/-1->8->-1 [15] 9/0/-1->8->-1 [16] 9/-1/-1->8->0 [17] 9/-1/-1->8->0 [18] 9/0/-1->8->-1 [19] 9/0/-1->8->-1 [20] 9/-1/-1->8->0 [21] 9/-1/-1->8->0 [22] 9/0/-1->8->-1 [23] 9/0/-1->8->-1 [24] 9/-1/-1->8->0 [25] 9/-1/-1->8->0 [26] 9/0/-1->8->-1 [27] 9/0/-1->8->-1 [28] 9/-1/-1->8->0 [29] 9/-1/-1->8->0 [30] 9/0/-1->8->-1 [31] 9/0/-1->8->-1 comm 0x7fced0b40010 nRanks 16 busId 5000 localhost:1542590:1542725 [0] MCCL INFO P2P Chunksize set to 131072 localhost:1542590:1542725 [0] MCCL INFO disable groupWriteback localhost:1542591:1542730 [1] MCCL INFO Disabling clique-based kernels due to topology (ignore with CLIQUE_IGNORE_TOPO) localhost:1542591:1542730 [1] MCCL INFO CLIQUE InitRemoteReadToggle enable_read:0, isaVersion:1000, topologyId:2 localhost:1542591:1542730 [1] MCCL INFO Clique kernels disabled localhost:1542591:1542730 [1] MCCL INFO Trees [0] 10/-1/-1->9->8 [1] 
10/-1/-1->9->8 [2] 10/-1/-1->9->8 [3] 10/-1/-1->9->8 [4] 10/-1/-1->9->8 [5] 10/-1/-1->9->8 [6] 10/-1/-1->9->8 [7] 10/-1/-1->9->8 [8] 10/-1/-1->9->8 [9] 10/-1/-1->9->8 [10] 10/-1/-1->9->8 [11] 10/-1/-1->9->8 [12] 10/-1/-1->9->8 [13] 10/-1/-1->9->8 [14] 10/-1/-1->9->8 [15] 10/-1/-1->9->8 [16] 10/-1/-1->9->8 [17] 10/-1/-1->9->8 [18] 10/-1/-1->9->8 [19] 10/-1/-1->9->8 [20] 10/-1/-1->9->8 [21] 10/-1/-1->9->8 [22] 10/-1/-1->9->8 [23] 10/-1/-1->9->8 [24] 10/-1/-1->9->8 [25] 10/-1/-1->9->8 [26] 10/-1/-1->9->8 [27] 10/-1/-1->9->8 [28] 10/-1/-1->9->8 [29] 10/-1/-1->9->8 [30] 10/-1/-1->9->8 [31] 10/-1/-1->9->8 comm 0x7f1e96c00010 nRanks 16 busId b000 localhost:1542591:1542730 [1] MCCL INFO P2P Chunksize set to 131072 localhost:1542591:1542730 [1] MCCL INFO disable groupWriteback localhost:1542593:1542726 [3] MCCL INFO Trees [0] -1/-1/-1->11->15 [1] -1/-1/-1->11->15 [2] -1/-1/-1->11->15 [3] -1/-1/-1->11->15 [4] -1/-1/-1->11->15 [5] -1/-1/-1->11->15 [6] -1/-1/-1->11->15 [7] -1/-1/-1->11->15 [8] -1/-1/-1->11->15 [9] -1/-1/-1->11->15 [10] -1/-1/-1->11->15 [11] -1/-1/-1->11->15 [12] -1/-1/-1->11->15 [13] -1/-1/-1->11->15 [14] -1/-1/-1->11->15 [15] -1/-1/-1->11->15 [16] -1/-1/-1->11->15 [17] -1/-1/-1->11->15 [18] -1/-1/-1->11->15 [19] -1/-1/-1->11->15 [20] -1/-1/-1->11->15 [21] -1/-1/-1->11->15 [22] -1/-1/-1->11->15 [23] -1/-1/-1->11->15 [24] -1/-1/-1->11->15 [25] -1/-1/-1->11->15 [26] -1/-1/-1->11->15 [27] -1/-1/-1->11->15 [28] -1/-1/-1->11->15 [29] -1/-1/-1->11->15 [30] -1/-1/-1->11->15 [31] -1/-1/-1->11->15 comm 0x7f4964024010 nRanks 16 busId f000 localhost:1542593:1542726 [3] MCCL INFO P2P Chunksize set to 131072 localhost:1542593:1542726 [3] MCCL INFO disable groupWriteback localhost:3808854:3809576 [3] MCCL INFO threadThresholds 8/8/64 | 8/8/64 | 256 | 256 localhost:3808854:3809576 [3] MCCL INFO 32 coll channels, 32 p2p channels, 4 p2p channels per peer localhost:1542592:1542731 [2] MCCL INFO Trees [0] 12/-1/-1->10->9 [1] 12/-1/-1->10->9 [2] 12/-1/-1->10->9 [3] 
12/-1/-1->10->9 [4] 12/-1/-1->10->9 [5] 12/-1/-1->10->9 [6] 12/-1/-1->10->9 [7] 12/-1/-1->10->9 [8] 12/-1/-1->10->9 [9] 12/-1/-1->10->9 [10] 12/-1/-1->10->9 [11] 12/-1/-1->10->9 [12] 12/-1/-1->10->9 [13] 12/-1/-1->10->9 [14] 12/-1/-1->10->9 [15] 12/-1/-1->10->9 [16] 12/-1/-1->10->9 [17] 12/-1/-1->10->9 [18] 12/-1/-1->10->9 [19] 12/-1/-1->10->9 [20] 12/-1/-1->10->9 [21] 12/-1/-1->10->9 [22] 12/-1/-1->10->9 [23] 12/-1/-1->10->9 [24] 12/-1/-1->10->9 [25] 12/-1/-1->10->9 [26] 12/-1/-1->10->9 [27] 12/-1/-1->10->9 [28] 12/-1/-1->10->9 [29] 12/-1/-1->10->9 [30] 12/-1/-1->10->9 [31] 12/-1/-1->10->9 comm 0x7fe15ae00010 nRanks 16 busId e000 localhost:1542592:1542731 [2] MCCL INFO P2P Chunksize set to 131072 localhost:1542592:1542731 [2] MCCL INFO disable groupWriteback localhost:1542594:1542728 [4] MCCL INFO CLIQUE InitRemoteReadToggle enable_read:0, isaVersion:1000, topologyId:2 localhost:1542594:1542728 [4] MCCL INFO Clique kernels disabled localhost:1542594:1542728 [4] MCCL INFO Trees [0] 13/-1/-1->12->10 [1] 13/-1/-1->12->10 [2] 13/-1/-1->12->10 [3] 13/-1/-1->12->10 [4] 13/-1/-1->12->10 [5] 13/-1/-1->12->10 [6] 13/-1/-1->12->10 [7] 13/-1/-1->12->10 [8] 13/-1/-1->12->10 [9] 13/-1/-1->12->10 [10] 13/-1/-1->12->10 [11] 13/-1/-1->12->10 [12] 13/-1/-1->12->10 [13] 13/-1/-1->12->10 [14] 13/-1/-1->12->10 [15] 13/-1/-1->12->10 [16] 13/-1/-1->12->10 [17] 13/-1/-1->12->10 [18] 13/-1/-1->12->10 [19] 13/-1/-1->12->10 [20] 13/-1/-1->12->10 [21] 13/-1/-1->12->10 [22] 13/-1/-1->12->10 [23] 13/-1/-1->12->10 [24] 13/-1/-1->12->10 [25] 13/-1/-1->12->10 [26] 13/-1/-1->12->10 [27] 13/-1/-1->12->10 [28] 13/-1/-1->12->10 [29] 13/-1/-1->12->10 [30] 13/-1/-1->12->10 [31] 13/-1/-1->12->10 comm 0x7fef53940010 nRanks 16 busId 55000 localhost:1542596:1542727 [6] MCCL INFO Trees [0] 15/-1/-1->14->13 [1] 15/-1/-1->14->13 [2] 15/-1/-1->14->13 [3] 15/-1/-1->14->13 [4] 15/-1/-1->14->13 [5] 15/-1/-1->14->13 [6] 15/-1/-1->14->13 [7] 15/-1/-1->14->13 [8] 15/-1/-1->14->13 [9] 15/-1/-1->14->13 [10] 
15/-1/-1->14->13 [11] 15/-1/-1->14->13 [12] 15/-1/-1->14->13 [13] 15/-1/-1->14->13 [14] 15/-1/-1->14->13 [15] 15/-1/-1->14->13 [16] 15/-1/-1->14->13 [17] 15/-1/-1->14->13 [18] 15/-1/-1->14->13 [19] 15/-1/-1->14->13 [20] 15/-1/-1->14->13 [21] 15/-1/-1->14->13 [22] 15/-1/-1->14->13 [23] 15/-1/-1->14->13 [24] 15/-1/-1->14->13 [25] 15/-1/-1->14->13 [26] 15/-1/-1->14->13 [27] 15/-1/-1->14->13 [28] 15/-1/-1->14->13 [29] 15/-1/-1->14->13 [30] 15/-1/-1->14->13 [31] 15/-1/-1->14->13 comm 0x7efc82c24010 nRanks 16 busId 5b000 localhost:1542596:1542727 [6] MCCL INFO P2P Chunksize set to 131072 localhost:1542596:1542727 [6] MCCL INFO disable groupWriteback localhost:1542594:1542728 [4] MCCL INFO P2P Chunksize set to 131072 localhost:1542594:1542728 [4] MCCL INFO disable groupWriteback localhost:1542593:1542726 [3] MCCL INFO threadThresholds 8/8/64 | 8/8/64 | 256 | 256 localhost:1542593:1542726 [3] MCCL INFO 32 coll channels, 32 p2p channels, 4 p2p channels per peer localhost:1542591:1542730 [1] MCCL INFO threadThresholds 8/8/64 | 8/8/64 | 256 | 256 localhost:1542591:1542730 [1] MCCL INFO 32 coll channels, 32 p2p channels, 4 p2p channels per peer localhost:3808853:3809578 [2] MCCL INFO threadThresholds 8/8/64 | 8/8/64 | 256 | 256 localhost:3808853:3809578 [2] MCCL INFO 32 coll channels, 32 p2p channels, 4 p2p channels per peer localhost:1542592:1542731 [2] MCCL INFO threadThresholds 8/8/64 | 8/8/64 | 256 | 256 localhost:1542592:1542731 [2] MCCL INFO 32 coll channels, 32 p2p channels, 4 p2p channels per peer localhost:1542590:1542725 [0] MCCL INFO threadThresholds 8/8/64 | 8/8/64 | 256 | 256 localhost:3808852:3809575 [1] MCCL INFO threadThresholds 8/8/64 | 8/8/64 | 256 | 256 localhost:1542590:1542725 [0] MCCL INFO 32 coll channels, 32 p2p channels, 4 p2p channels per peer localhost:3808852:3809575 [1] MCCL INFO 32 coll channels, 32 p2p channels, 4 p2p channels per peer localhost:1542597:1542724 [7] MCCL INFO threadThresholds 8/8/64 | 8/8/64 | 256 | 256 localhost:1542597:1542724 
[7] MCCL INFO 32 coll channels, 32 p2p channels, 4 p2p channels per peer localhost:3808857:3809577 [5] MCCL INFO threadThresholds 8/8/64 | 8/8/64 | 256 | 256 localhost:3808858:3809574 [6] MCCL INFO threadThresholds 8/8/64 | 8/8/64 | 256 | 256 localhost:3808857:3809577 [5] MCCL INFO 32 coll channels, 32 p2p channels, 4 p2p channels per peer localhost:3808858:3809574 [6] MCCL INFO 32 coll channels, 32 p2p channels, 4 p2p channels per peer localhost:1542596:1542727 [6] MCCL INFO threadThresholds 8/8/64 | 8/8/64 | 256 | 256 localhost:1542595:1542729 [5] MCCL INFO threadThresholds 8/8/64 | 8/8/64 | 256 | 256 localhost:1542595:1542729 [5] MCCL INFO 32 coll channels, 32 p2p channels, 4 p2p channels per peer localhost:3808851:3809571 [0] MCCL INFO threadThresholds 8/8/64 | 8/8/64 | 256 | 256 localhost:1542596:1542727 [6] MCCL INFO 32 coll channels, 32 p2p channels, 4 p2p channels per peer localhost:3808854:3809698 [3] MCCL INFO New proxy send connection 0 from local rank 3, transport 2 localhost:1542594:1542728 [4] MCCL INFO threadThresholds 8/8/64 | 8/8/64 | 256 | 256 localhost:1542594:1542728 [4] MCCL INFO 32 coll channels, 32 p2p channels, 4 p2p channels per peer localhost:3808851:3809571 [0] MCCL INFO 32 coll channels, 32 p2p channels, 4 p2p channels per peer localhost:1542593:1542816 [3] MCCL INFO New proxy send connection 0 from local rank 3, transport 2 localhost:3808854:3809576 [3] MCCL INFO Connection to proxy localRank 3 -> connection 0x7f0f90001010 localhost:3808855:3809573 [4] MCCL INFO threadThresholds 8/8/64 | 8/8/64 | 256 | 256 localhost:1542593:1542726 [3] MCCL INFO Connection to proxy localRank 3 -> connection 0x7f483c001010 localhost:3808855:3809573 [4] MCCL INFO 32 coll channels, 32 p2p channels, 4 p2p channels per peer localhost:3808864:3809579 [7] MCCL INFO threadThresholds 8/8/64 | 8/8/64 | 256 | 256 localhost:1542591:1542821 [1] MCCL INFO New proxy send connection 0 from local rank 1, transport 2 localhost:3808864:3809579 [7] MCCL INFO 32 coll 
channels, 32 p2p channels, 4 p2p channels per peer localhost:1542591:1542730 [1] MCCL INFO Connection to proxy localRank 1 -> connection 0x7f1d68001010 localhost:1542592:1542826 [2] MCCL INFO New proxy send connection 0 from local rank 2, transport 2 localhost:3808852:3809712 [1] MCCL INFO New proxy send connection 0 from local rank 1, transport 2 localhost:3808852:3809575 [1] MCCL INFO Connection to proxy localRank 1 -> connection 0x7fd888001010 localhost:1542597:1542822 [7] MCCL INFO New proxy send connection 0 from local rank 7, transport 2 localhost:1542597:1542724 [7] MCCL INFO Connection to proxy localRank 7 -> connection 0x7f74a0001010 localhost:3808853:3809701 [2] MCCL INFO New proxy send connection 0 from local rank 2, transport 2 localhost:1542590:1542823 [0] MCCL INFO New proxy send connection 0 from local rank 0, transport 2 localhost:3808853:3809578 [2] MCCL INFO Connection to proxy localRank 2 -> connection 0x7fc5dc001010 localhost:1542595:1542820 [5] MCCL INFO New proxy send connection 0 from local rank 5, transport 2 localhost:1542596:1542819 [6] MCCL INFO New proxy send connection 0 from local rank 6, transport 2 localhost:3808857:3809706 [5] MCCL INFO New proxy send connection 0 from local rank 5, transport 2 localhost:1542592:1542731 [2] MCCL INFO Connection to proxy localRank 2 -> connection 0x7fe02c001010 localhost:1542590:1542725 [0] MCCL INFO Connection to proxy localRank 0 -> connection 0x7fcda8001010 localhost:1542595:1542729 [5] MCCL INFO Connection to proxy localRank 5 -> connection 0x7f92c8001010 localhost:1542594:1542813 [4] MCCL INFO New proxy send connection 0 from local rank 4, transport 2 localhost:1542594:1542728 [4] MCCL INFO Connection to proxy localRank 4 -> connection 0x7fee2c001010 localhost:3808851:3809710 [0] MCCL INFO New proxy send connection 0 from local rank 0, transport 2 localhost:3808851:3809571 [0] MCCL INFO Connection to proxy localRank 0 -> connection 0x7feda8001010 localhost:1542596:1542727 [6] MCCL INFO 
Connection to proxy localRank 6 -> connection 0x7efb54001010 localhost:3808857:3809577 [5] MCCL INFO Connection to proxy localRank 5 -> connection 0x7f3b00001010 localhost:3808858:3809705 [6] MCCL INFO New proxy send connection 0 from local rank 6, transport 2 localhost:3808855:3809704 [4] MCCL INFO New proxy send connection 0 from local rank 4, transport 2 localhost:3808855:3809573 [4] MCCL INFO Connection to proxy localRank 4 -> connection 0x7fb504001010 localhost:3808858:3809574 [6] MCCL INFO Connection to proxy localRank 6 -> connection 0x7fce5c001010 localhost:3808864:3809707 [7] MCCL INFO New proxy send connection 0 from local rank 7, transport 2 localhost:3808864:3809579 [7] MCCL INFO Connection to proxy localRank 7 -> connection 0x7f1378001010 localhost:3808854:3809576 [3] MCCL INFO comm 0x7f10c1b6a010 rank 3 nranks 16 macaDev 3 busId f000 localSize 0 used 67724896 bytes - Init COMPLETE localhost:3808857:3809577 [5] MCCL INFO comm 0x7f3c2ee24010 rank 5 nranks 16 macaDev 5 busId 56000 localSize 0 used 67724896 bytes - Init COMPLETE localhost:3808852:3809575 [1] MCCL INFO comm 0x7fd9b9c24010 rank 1 nranks 16 macaDev 1 busId b000 localSize 0 used 67724896 bytes - Init COMPLETE localhost:3808864:3809579 [7] MCCL INFO comm 0x7f14a6340010 rank 7 nranks 16 macaDev 7 busId 5e000 localSize 0 used 67724896 bytes - Init COMPLETE localhost:3808858:3809574 [6] MCCL INFO comm 0x7fcf8a717010 rank 6 nranks 16 macaDev 6 busId 5b000 localSize 0 used 67724896 bytes - Init COMPLETE localhost:3808855:3809573 [4] MCCL INFO comm 0x7fb633d40010 rank 4 nranks 16 macaDev 4 busId 55000 localSize 0 used 67724896 bytes - Init COMPLETE localhost:3808853:3809578 [2] MCCL INFO comm 0x7fc704e00010 rank 2 nranks 16 macaDev 2 busId e000 localSize 0 used 67724896 bytes - Init COMPLETE localhost:3808851:3809571 [0] MCCL INFO comm 0x7feed6140010 rank 0 nranks 16 macaDev 0 busId 5000 localSize 0 used 67724896 bytes - Init COMPLETE # #                                                           
┌----- out-of-place ------┐       ┌------ in-place -------┐
#        size         count      type   redop    root      time    algbw   busbw   #wrong    time   algbw   busbw   #wrong
#         (B)    (elements)                                (us)   (GB/s)  (GB/s)              (us)  (GB/s)  (GB/s)
localhost:3808851:3809710 [0] MCCL INFO New proxy recv connection 1 from local rank 0, transport 2
localhost:1542593:1542726 [3] MCCL INFO comm 0x7f4964024010 rank 11 nranks 16 macaDev 3 busId f000 localSize 0 used 67724896 bytes - Init COMPLETE
localhost:1542596:1542727 [6] MCCL INFO comm 0x7efc82c24010 rank 14 nranks 16 macaDev 6 busId 5b000 localSize 0 used 67724896 bytes - Init COMPLETE
localhost:1542597:1542724 [7] MCCL INFO comm 0x7f75cdd40010 rank 15 nranks 16 macaDev 7 busId 5e000 localSize 0 used 67724896 bytes - Init COMPLETE
localhost:1542594:1542728 [4] MCCL INFO comm 0x7fef53940010 rank 12 nranks 16 macaDev 4 busId 55000 localSize 0 used 67724896 bytes - Init COMPLETE
localhost:1542592:1542731 [2] MCCL INFO comm 0x7fe15ae00010 rank 10 nranks 16 macaDev 2 busId e000 localSize 0 used 67724896 bytes - Init COMPLETE
localhost:1542591:1542730 [1] MCCL INFO comm 0x7f1e96c00010 rank 9 nranks 16 macaDev 1 busId b000 localSize 0 used 67724896 bytes - Init COMPLETE
localhost:1542595:1542729 [5] MCCL INFO comm 0x7f93ef76a010 rank 13 nranks 16 macaDev 5 busId 56000 localSize 0 used 67724896 bytes - Init COMPLETE
localhost:1542590:1542725 [0] MCCL INFO comm 0x7fced0b40010 rank 8 nranks 16 macaDev 0 busId 5000 localSize 0 used 67724896 bytes - Init COMPLETE
localhost:1542590:1542823 [0] MCCL INFO New proxy recv connection 1 from local rank 0, transport 2
localhost:3808851:3809776 [0] MCCL INFO Connection to proxy localRank 0 -> connection 0x7feda8001058
localhost:3808851:3809710 [0] MCCL INFO ip=192.168.100.13, port=0
localhost:3808851:3809776 [0] MCCL INFO Channel 00/0 : 15[5e000] -> 0[5000] [receive] via NET/IB/0 comm 0x7feed6140010 nRanks 16
localhost:3808851:3809710 [0] MCCL INFO New proxy recv connection 2 from local rank 0, transport 2 localhost:1542590:1542898 [0] MCCL INFO Connection to proxy localRank 0 -> connection 0x7fcda8001058 localhost:1542590:1542823 [0] MCCL INFO ip=192.168.100.14, port=0 localhost:1542590:1542898 [0] MCCL INFO Channel 00/0 : 7[5e000] -> 8[5000] [receive] via NET/IB/0 comm 0x7fced0b40010 nRanks 16 localhost:1542590:1542823 [0] MCCL INFO New proxy recv connection 2 from local rank 0, transport 2 localhost:3808851:3809776 [0] MCCL INFO Connection to proxy localRank 0 -> connection 0x7feda80010a0 localhost:3808851:3809710 [0] MCCL INFO ip=192.168.100.13, port=0 localhost:3808851:3809776 [0] MCCL INFO Channel 01/0 : 15[5e000] -> 0[5000] [receive] via NET/IB/0 comm 0x7feed6140010 nRanks 16 localhost:3808851:3809710 [0] MCCL INFO New proxy recv connection 3 from local rank 0, transport 2 localhost:1542590:1542898 [0] MCCL INFO Connection to proxy localRank 0 -> connection 0x7fcda80010a0 localhost:1542590:1542823 [0] MCCL INFO ip=192.168.100.14, port=0 localhost:1542590:1542898 [0] MCCL INFO Channel 01/0 : 7[5e000] -> 8[5000] [receive] via NET/IB/0 comm 0x7fced0b40010 nRanks 16 localhost:1542590:1542823 [0] MCCL INFO New proxy recv connection 3 from local rank 0, transport 2 localhost:3808851:3809776 [0] MCCL INFO Connection to proxy localRank 0 -> connection 0x7feda80010e8 localhost:3808851:3809710 [0] MCCL INFO ip=192.168.100.13, port=0 localhost:3808851:3809776 [0] MCCL INFO Channel 02/0 : 15[5e000] -> 0[5000] [receive] via NET/IB/0 comm 0x7feed6140010 nRanks 16 localhost:3808851:3809710 [0] MCCL INFO New proxy recv connection 4 from local rank 0, transport 2 localhost:1542590:1542898 [0] MCCL INFO Connection to proxy localRank 0 -> connection 0x7fcda80010e8 localhost:1542590:1542823 [0] MCCL INFO ip=192.168.100.14, port=0 localhost:1542590:1542898 [0] MCCL INFO Channel 02/0 : 7[5e000] -> 8[5000] [receive] via NET/IB/0 comm 0x7fced0b40010 nRanks 16 localhost:1542590:1542823 
[0] MCCL INFO New proxy recv connection 4 from local rank 0, transport 2 localhost:3808853:3809774 [2] MCCL INFO Channel 00 : 2[e000] -> 3[f000] via SHM/direct/direct comm 0x7fc704e00010 nRanks 16 localhost:3808851:3809776 [0] MCCL INFO Connection to proxy localRank 0 -> connection 0x7feda8001130 localhost:3808853:3809774 [2] MCCL INFO Channel 01 : 2[e000] -> 3[f000] via SHM/direct/direct comm 0x7fc704e00010 nRanks 16 localhost:3808851:3809710 [0] MCCL INFO ip=192.168.100.13, port=0 localhost:3808851:3809776 [0] MCCL INFO Channel 03/0 : 15[5e000] -> 0[5000] [receive] via NET/IB/0 comm 0x7feed6140010 nRanks 16 localhost:3808851:3809710 [0] MCCL INFO New proxy recv connection 5 from local rank 0, transport 2 localhost:3808853:3809774 [2] MCCL INFO Channel 02 : 2[e000] -> 3[f000] via SHM/direct/direct comm 0x7fc704e00010 nRanks 16 localhost:3808853:3809774 [2] MCCL INFO Channel 03 : 2[e000] -> 3[f000] via SHM/direct/direct comm 0x7fc704e00010 nRanks 16 localhost:3808853:3809774 [2] MCCL INFO Channel 04 : 2[e000] -> 3[f000] via SHM/direct/direct comm 0x7fc704e00010 nRanks 16 localhost:3808853:3809774 [2] MCCL INFO Channel 05 : 2[e000] -> 3[f000] via SHM/direct/direct comm 0x7fc704e00010 nRanks 16 localhost:3808852:3809773 [1] MCCL INFO Channel 00 : 1[b000] -> 2[e000] via SHM/direct/direct comm 0x7fd9b9c24010 nRanks 16 localhost:3808853:3809774 [2] MCCL INFO Channel 06 : 2[e000] -> 3[f000] via SHM/direct/direct comm 0x7fc704e00010 nRanks 16 localhost:3808853:3809774 [2] MCCL INFO Channel 07 : 2[e000] -> 3[f000] via SHM/direct/direct comm 0x7fc704e00010 nRanks 16 localhost:3808852:3809773 [1] MCCL INFO Channel 01 : 1[b000] -> 2[e000] via SHM/direct/direct comm 0x7fd9b9c24010 nRanks 16 localhost:3808852:3809773 [1] MCCL INFO Channel 02 : 1[b000] -> 2[e000] via SHM/direct/direct comm 0x7fd9b9c24010 nRanks 16 localhost:3808852:3809773 [1] MCCL INFO Channel 03 : 1[b000] -> 2[e000] via SHM/direct/direct comm 0x7fd9b9c24010 nRanks 16 localhost:3808853:3809774 [2] MCCL INFO 
Channel 08 : 2[e000] -> 3[f000] via SHM/direct/direct comm 0x7fc704e00010 nRanks 16 localhost:3808853:3809774 [2] MCCL INFO Channel 09 : 2[e000] -> 3[f000] via SHM/direct/direct comm 0x7fc704e00010 nRanks 16 localhost:3808853:3809774 [2] MCCL INFO Channel 10 : 2[e000] -> 3[f000] via SHM/direct/direct comm 0x7fc704e00010 nRanks 16 localhost:3808852:3809773 [1] MCCL INFO Channel 04 : 1[b000] -> 2[e000] via SHM/direct/direct comm 0x7fd9b9c24010 nRanks 16 localhost:3808853:3809774 [2] MCCL INFO Channel 11 : 2[e000] -> 3[f000] via SHM/direct/direct comm 0x7fc704e00010 nRanks 16 localhost:3808852:3809773 [1] MCCL INFO Channel 05 : 1[b000] -> 2[e000] via SHM/direct/direct comm 0x7fd9b9c24010 nRanks 16 localhost:3808853:3809774 [2] MCCL INFO Channel 12 : 2[e000] -> 3[f000] via SHM/direct/direct comm 0x7fc704e00010 nRanks 16 localhost:3808852:3809773 [1] MCCL INFO Channel 06 : 1[b000] -> 2[e000] via SHM/direct/direct comm 0x7fd9b9c24010 nRanks 16 localhost:3808853:3809774 [2] MCCL INFO Channel 13 : 2[e000] -> 3[f000] via SHM/direct/direct comm 0x7fc704e00010 nRanks 16 localhost:3808852:3809773 [1] MCCL INFO Channel 07 : 1[b000] -> 2[e000] via SHM/direct/direct comm 0x7fd9b9c24010 nRanks 16 localhost:3808853:3809774 [2] MCCL INFO Channel 14 : 2[e000] -> 3[f000] via SHM/direct/direct comm 0x7fc704e00010 nRanks 16 localhost:3808852:3809773 [1] MCCL INFO Channel 08 : 1[b000] -> 2[e000] via SHM/direct/direct comm 0x7fd9b9c24010 nRanks 16 localhost:3808853:3809774 [2] MCCL INFO Channel 15 : 2[e000] -> 3[f000] via SHM/direct/direct comm 0x7fc704e00010 nRanks 16 localhost:3808852:3809773 [1] MCCL INFO Channel 09 : 1[b000] -> 2[e000] via SHM/direct/direct comm 0x7fd9b9c24010 nRanks 16 localhost:3808853:3809774 [2] MCCL INFO Channel 16 : 2[e000] -> 3[f000] via SHM/direct/direct comm 0x7fc704e00010 nRanks 16 localhost:3808852:3809773 [1] MCCL INFO Channel 10 : 1[b000] -> 2[e000] via SHM/direct/direct comm 0x7fd9b9c24010 nRanks 16 localhost:3808853:3809774 [2] MCCL INFO Channel 17 : 
2[e000] -> 3[f000] via SHM/direct/direct comm 0x7fc704e00010 nRanks 16 localhost:3808853:3809774 [2] MCCL INFO Channel 18 : 2[e000] -> 3[f000] via SHM/direct/direct comm 0x7fc704e00010 nRanks 16 localhost:3808852:3809773 [1] MCCL INFO Channel 11 : 1[b000] -> 2[e000] via SHM/direct/direct comm 0x7fd9b9c24010 nRanks 16 localhost:3808853:3809774 [2] MCCL INFO Channel 19 : 2[e000] -> 3[f000] via SHM/direct/direct comm 0x7fc704e00010 nRanks 16 localhost:3808852:3809773 [1] MCCL INFO Channel 12 : 1[b000] -> 2[e000] via SHM/direct/direct comm 0x7fd9b9c24010 nRanks 16 localhost:3808853:3809774 [2] MCCL INFO Channel 20 : 2[e000] -> 3[f000] via SHM/direct/direct comm 0x7fc704e00010 nRanks 16 localhost:3808852:3809773 [1] MCCL INFO Channel 13 : 1[b000] -> 2[e000] via SHM/direct/direct comm 0x7fd9b9c24010 nRanks 16 localhost:3808853:3809774 [2] MCCL INFO Channel 21 : 2[e000] -> 3[f000] via SHM/direct/direct comm 0x7fc704e00010 nRanks 16 localhost:3808852:3809773 [1] MCCL INFO Channel 14 : 1[b000] -> 2[e000] via SHM/direct/direct comm 0x7fd9b9c24010 nRanks 16 localhost:3808853:3809774 [2] MCCL INFO Channel 22 : 2[e000] -> 3[f000] via SHM/direct/direct comm 0x7fc704e00010 nRanks 16 localhost:3808852:3809773 [1] MCCL INFO Channel 15 : 1[b000] -> 2[e000] via SHM/direct/direct comm 0x7fd9b9c24010 nRanks 16 localhost:3808853:3809774 [2] MCCL INFO Channel 23 : 2[e000] -> 3[f000] via SHM/direct/direct comm 0x7fc704e00010 nRanks 16 localhost:3808852:3809773 [1] MCCL INFO Channel 16 : 1[b000] -> 2[e000] via SHM/direct/direct comm 0x7fd9b9c24010 nRanks 16 localhost:3808853:3809774 [2] MCCL INFO Channel 24 : 2[e000] -> 3[f000] via SHM/direct/direct comm 0x7fc704e00010 nRanks 16 localhost:3808852:3809773 [1] MCCL INFO Channel 17 : 1[b000] -> 2[e000] via SHM/direct/direct comm 0x7fd9b9c24010 nRanks 16 localhost:3808854:3809769 [3] MCCL INFO Channel 00 : 3[f000] -> 4[55000] via SHM/direct/direct comm 0x7f10c1b6a010 nRanks 16 localhost:3808853:3809774 [2] MCCL INFO Channel 25 : 2[e000] -> 
3[f000] via SHM/direct/direct comm 0x7fc704e00010 nRanks 16 localhost:3808852:3809773 [1] MCCL INFO Channel 18 : 1[b000] -> 2[e000] via SHM/direct/direct comm 0x7fd9b9c24010 nRanks 16 localhost:3808853:3809774 [2] MCCL INFO Channel 26 : 2[e000] -> 3[f000] via SHM/direct/direct comm 0x7fc704e00010 nRanks 16 localhost:3808854:3809769 [3] MCCL INFO Channel 01 : 3[f000] -> 4[55000] via SHM/direct/direct comm 0x7f10c1b6a010 nRanks 16 localhost:3808852:3809773 [1] MCCL INFO Channel 19 : 1[b000] -> 2[e000] via SHM/direct/direct comm 0x7fd9b9c24010 nRanks 16 localhost:3808853:3809774 [2] MCCL INFO Channel 27 : 2[e000] -> 3[f000] via SHM/direct/direct comm 0x7fc704e00010 nRanks 16 localhost:3808854:3809769 [3] MCCL INFO Channel 02 : 3[f000] -> 4[55000] via SHM/direct/direct comm 0x7f10c1b6a010 nRanks 16 localhost:3808852:3809773 [1] MCCL INFO Channel 20 : 1[b000] -> 2[e000] via SHM/direct/direct comm 0x7fd9b9c24010 nRanks 16 localhost:3808853:3809774 [2] MCCL INFO Channel 28 : 2[e000] -> 3[f000] via SHM/direct/direct comm 0x7fc704e00010 nRanks 16 localhost:3808852:3809773 [1] MCCL INFO Channel 21 : 1[b000] -> 2[e000] via SHM/direct/direct comm 0x7fd9b9c24010 nRanks 16 localhost:3808854:3809769 [3] MCCL INFO Channel 03 : 3[f000] -> 4[55000] via SHM/direct/direct comm 0x7f10c1b6a010 nRanks 16 localhost:3808853:3809774 [2] MCCL INFO Channel 29 : 2[e000] -> 3[f000] via SHM/direct/direct comm 0x7fc704e00010 nRanks 16 localhost:3808852:3809773 [1] MCCL INFO Channel 22 : 1[b000] -> 2[e000] via SHM/direct/direct comm 0x7fd9b9c24010 nRanks 16 localhost:3808854:3809769 [3] MCCL INFO Channel 04 : 3[f000] -> 4[55000] via SHM/direct/direct comm 0x7f10c1b6a010 nRanks 16 localhost:3808853:3809774 [2] MCCL INFO Channel 30 : 2[e000] -> 3[f000] via SHM/direct/direct comm 0x7fc704e00010 nRanks 16 localhost:3808852:3809773 [1] MCCL INFO Channel 23 : 1[b000] -> 2[e000] via SHM/direct/direct comm 0x7fd9b9c24010 nRanks 16 localhost:3808854:3809769 [3] MCCL INFO Channel 05 : 3[f000] -> 4[55000] 
via SHM/direct/direct comm 0x7f10c1b6a010 nRanks 16 localhost:3808853:3809774 [2] MCCL INFO Channel 31 : 2[e000] -> 3[f000] via SHM/direct/direct comm 0x7fc704e00010 nRanks 16 localhost:3808852:3809773 [1] MCCL INFO Channel 24 : 1[b000] -> 2[e000] via SHM/direct/direct comm 0x7fd9b9c24010 nRanks 16 localhost:3808854:3809769 [3] MCCL INFO Channel 06 : 3[f000] -> 4[55000] via SHM/direct/direct comm 0x7f10c1b6a010 nRanks 16 localhost:3808852:3809773 [1] MCCL INFO Channel 25 : 1[b000] -> 2[e000] via SHM/direct/direct comm 0x7fd9b9c24010 nRanks 16 localhost:3808854:3809769 [3] MCCL INFO Channel 07 : 3[f000] -> 4[55000] via SHM/direct/direct comm 0x7f10c1b6a010 nRanks 16 localhost:3808852:3809773 [1] MCCL INFO Channel 26 : 1[b000] -> 2[e000] via SHM/direct/direct comm 0x7fd9b9c24010 nRanks 16 localhost:3808854:3809769 [3] MCCL INFO Channel 08 : 3[f000] -> 4[55000] via SHM/direct/direct comm 0x7f10c1b6a010 nRanks 16 localhost:3808852:3809773 [1] MCCL INFO Channel 27 : 1[b000] -> 2[e000] via SHM/direct/direct comm 0x7fd9b9c24010 nRanks 16 localhost:3808854:3809769 [3] MCCL INFO Channel 09 : 3[f000] -> 4[55000] via SHM/direct/direct comm 0x7f10c1b6a010 nRanks 16 localhost:3808852:3809773 [1] MCCL INFO Channel 28 : 1[b000] -> 2[e000] via SHM/direct/direct comm 0x7fd9b9c24010 nRanks 16 localhost:3808854:3809769 [3] MCCL INFO Channel 10 : 3[f000] -> 4[55000] via SHM/direct/direct comm 0x7f10c1b6a010 nRanks 16 localhost:3808852:3809773 [1] MCCL INFO Channel 29 : 1[b000] -> 2[e000] via SHM/direct/direct comm 0x7fd9b9c24010 nRanks 16 localhost:3808854:3809769 [3] MCCL INFO Channel 11 : 3[f000] -> 4[55000] via SHM/direct/direct comm 0x7f10c1b6a010 nRanks 16 localhost:3808852:3809773 [1] MCCL INFO Channel 30 : 1[b000] -> 2[e000] via SHM/direct/direct comm 0x7fd9b9c24010 nRanks 16 localhost:3808854:3809769 [3] MCCL INFO Channel 12 : 3[f000] -> 4[55000] via SHM/direct/direct comm 0x7f10c1b6a010 nRanks 16 localhost:3808852:3809773 [1] MCCL INFO Channel 31 : 1[b000] -> 2[e000] via 
SHM/direct/direct comm 0x7fd9b9c24010 nRanks 16 localhost:3808854:3809769 [3] MCCL INFO Channel 13 : 3[f000] -> 4[55000] via SHM/direct/direct comm 0x7f10c1b6a010 nRanks 16 localhost:3808854:3809769 [3] MCCL INFO Channel 14 : 3[f000] -> 4[55000] via SHM/direct/direct comm 0x7f10c1b6a010 nRanks 16 localhost:3808854:3809769 [3] MCCL INFO Channel 15 : 3[f000] -> 4[55000] via SHM/direct/direct comm 0x7f10c1b6a010 nRanks 16 localhost:3808854:3809769 [3] MCCL INFO Channel 16 : 3[f000] -> 4[55000] via SHM/direct/direct comm 0x7f10c1b6a010 nRanks 16 localhost:3808854:3809769 [3] MCCL INFO Channel 17 : 3[f000] -> 4[55000] via SHM/direct/direct comm 0x7f10c1b6a010 nRanks 16 localhost:3808854:3809769 [3] MCCL INFO Channel 18 : 3[f000] -> 4[55000] via SHM/direct/direct comm 0x7f10c1b6a010 nRanks 16 localhost:3808854:3809769 [3] MCCL INFO Channel 19 : 3[f000] -> 4[55000] via SHM/direct/direct comm 0x7f10c1b6a010 nRanks 16 localhost:3808854:3809769 [3] MCCL INFO Channel 20 : 3[f000] -> 4[55000] via SHM/direct/direct comm 0x7f10c1b6a010 nRanks 16 localhost:3808854:3809769 [3] MCCL INFO Channel 21 : 3[f000] -> 4[55000] via SHM/direct/direct comm 0x7f10c1b6a010 nRanks 16 localhost:3808854:3809769 [3] MCCL INFO Channel 22 : 3[f000] -> 4[55000] via SHM/direct/direct comm 0x7f10c1b6a010 nRanks 16 localhost:3808854:3809769 [3] MCCL INFO Channel 23 : 3[f000] -> 4[55000] via SHM/direct/direct comm 0x7f10c1b6a010 nRanks 16 localhost:3808854:3809769 [3] MCCL INFO Channel 24 : 3[f000] -> 4[55000] via SHM/direct/direct comm 0x7f10c1b6a010 nRanks 16 localhost:3808854:3809769 [3] MCCL INFO Channel 25 : 3[f000] -> 4[55000] via SHM/direct/direct comm 0x7f10c1b6a010 nRanks 16 localhost:3808854:3809769 [3] MCCL INFO Channel 26 : 3[f000] -> 4[55000] via SHM/direct/direct comm 0x7f10c1b6a010 nRanks 16 localhost:3808854:3809769 [3] MCCL INFO Channel 27 : 3[f000] -> 4[55000] via SHM/direct/direct comm 0x7f10c1b6a010 nRanks 16 localhost:3808854:3809769 [3] MCCL INFO Channel 28 : 3[f000] -> 4[55000] via 
SHM/direct/direct comm 0x7f10c1b6a010 nRanks 16 localhost:3808854:3809769 [3] MCCL INFO Channel 29 : 3[f000] -> 4[55000] via SHM/direct/direct comm 0x7f10c1b6a010 nRanks 16 localhost:3808854:3809769 [3] MCCL INFO Channel 30 : 3[f000] -> 4[55000] via SHM/direct/direct comm 0x7f10c1b6a010 nRanks 16 localhost:3808854:3809769 [3] MCCL INFO Channel 31 : 3[f000] -> 4[55000] via SHM/direct/direct comm 0x7f10c1b6a010 nRanks 16 localhost:1542593:1542895 [3] MCCL INFO Channel 00 : 11[f000] -> 12[55000] via SHM/direct/direct comm 0x7f4964024010 nRanks 16 localhost:1542593:1542895 [3] MCCL INFO Channel 01 : 11[f000] -> 12[55000] via SHM/direct/direct comm 0x7f4964024010 nRanks 16 localhost:1542593:1542895 [3] MCCL INFO Channel 02 : 11[f000] -> 12[55000] via SHM/direct/direct comm 0x7f4964024010 nRanks 16 localhost:1542593:1542895 [3] MCCL INFO Channel 03 : 11[f000] -> 12[55000] via SHM/direct/direct comm 0x7f4964024010 nRanks 16 localhost:1542593:1542895 [3] MCCL INFO Channel 04 : 11[f000] -> 12[55000] via SHM/direct/direct comm 0x7f4964024010 nRanks 16 localhost:1542593:1542895 [3] MCCL INFO Channel 05 : 11[f000] -> 12[55000] via SHM/direct/direct comm 0x7f4964024010 nRanks 16 localhost:1542593:1542895 [3] MCCL INFO Channel 06 : 11[f000] -> 12[55000] via SHM/direct/direct comm 0x7f4964024010 nRanks 16 localhost:1542593:1542895 [3] MCCL INFO Channel 07 : 11[f000] -> 12[55000] via SHM/direct/direct comm 0x7f4964024010 nRanks 16 localhost:1542593:1542895 [3] MCCL INFO Channel 08 : 11[f000] -> 12[55000] via SHM/direct/direct comm 0x7f4964024010 nRanks 16 localhost:1542593:1542895 [3] MCCL INFO Channel 09 : 11[f000] -> 12[55000] via SHM/direct/direct comm 0x7f4964024010 nRanks 16 localhost:1542593:1542895 [3] MCCL INFO Channel 10 : 11[f000] -> 12[55000] via SHM/direct/direct comm 0x7f4964024010 nRanks 16 localhost:1542593:1542895 [3] MCCL INFO Channel 11 : 11[f000] -> 12[55000] via SHM/direct/direct comm 0x7f4964024010 nRanks 16 localhost:1542593:1542895 [3] MCCL INFO Channel 12 : 
11[f000] -> 12[55000] via SHM/direct/direct comm 0x7f4964024010 nRanks 16 localhost:1542593:1542895 [3] MCCL INFO Channel 13 : 11[f000] -> 12[55000] via SHM/direct/direct comm 0x7f4964024010 nRanks 16 localhost:1542593:1542895 [3] MCCL INFO Channel 14 : 11[f000] -> 12[55000] via SHM/direct/direct comm 0x7f4964024010 nRanks 16 localhost:1542593:1542895 [3] MCCL INFO Channel 15 : 11[f000] -> 12[55000] via SHM/direct/direct comm 0x7f4964024010 nRanks 16 localhost:1542593:1542895 [3] MCCL INFO Channel 16 : 11[f000] -> 12[55000] via SHM/direct/direct comm 0x7f4964024010 nRanks 16 localhost:1542590:1542898 [0] MCCL INFO Connection to proxy localRank 0 -> connection 0x7fcda8001130 localhost:1542590:1542823 [0] MCCL INFO ip=192.168.100.14, port=0 localhost:1542590:1542898 [0] MCCL INFO Channel 03/0 : 7[5e000] -> 8[5000] [receive] via NET/IB/0 comm 0x7fced0b40010 nRanks 16 localhost:1542593:1542895 [3] MCCL INFO Channel 17 : 11[f000] -> 12[55000] via SHM/direct/direct comm 0x7f4964024010 nRanks 16 localhost:1542593:1542895 [3] MCCL INFO Channel 18 : 11[f000] -> 12[55000] via SHM/direct/direct comm 0x7f4964024010 nRanks 16 localhost:1542590:1542823 [0] MCCL INFO New proxy recv connection 5 from local rank 0, transport 2 localhost:1542593:1542895 [3] MCCL INFO Channel 19 : 11[f000] -> 12[55000] via SHM/direct/direct comm 0x7f4964024010 nRanks 16 localhost:1542593:1542895 [3] MCCL INFO Channel 20 : 11[f000] -> 12[55000] via SHM/direct/direct comm 0x7f4964024010 nRanks 16 localhost:1542593:1542895 [3] MCCL INFO Channel 21 : 11[f000] -> 12[55000] via SHM/direct/direct comm 0x7f4964024010 nRanks 16 localhost:1542593:1542895 [3] MCCL INFO Channel 22 : 11[f000] -> 12[55000] via SHM/direct/direct comm 0x7f4964024010 nRanks 16 localhost:1542593:1542895 [3] MCCL INFO Channel 23 : 11[f000] -> 12[55000] via SHM/direct/direct comm 0x7f4964024010 nRanks 16 localhost:1542593:1542895 [3] MCCL INFO Channel 24 : 11[f000] -> 12[55000] via SHM/direct/direct comm 0x7f4964024010 nRanks 16 
localhost:1542593:1542895 [3] MCCL INFO Channel 25 : 11[f000] -> 12[55000] via SHM/direct/direct comm 0x7f4964024010 nRanks 16 localhost:1542593:1542895 [3] MCCL INFO Channel 26 : 11[f000] -> 12[55000] via SHM/direct/direct comm 0x7f4964024010 nRanks 16 localhost:1542593:1542895 [3] MCCL INFO Channel 27 : 11[f000] -> 12[55000] via SHM/direct/direct comm 0x7f4964024010 nRanks 16 localhost:1542593:1542895 [3] MCCL INFO Channel 28 : 11[f000] -> 12[55000] via SHM/direct/direct comm 0x7f4964024010 nRanks 16 localhost:1542593:1542895 [3] MCCL INFO Channel 29 : 11[f000] -> 12[55000] via SHM/direct/direct comm 0x7f4964024010 nRanks 16 localhost:1542593:1542895 [3] MCCL INFO Channel 30 : 11[f000] -> 12[55000] via SHM/direct/direct comm 0x7f4964024010 nRanks 16 localhost:1542593:1542895 [3] MCCL INFO Channel 31 : 11[f000] -> 12[55000] via SHM/direct/direct comm 0x7f4964024010 nRanks 16 localhost:1542595:1542900 [5] MCCL INFO Channel 00 : 13[56000] -> 14[5b000] via SHM/direct/direct comm 0x7f93ef76a010 nRanks 16 localhost:1542595:1542900 [5] MCCL INFO Channel 01 : 13[56000] -> 14[5b000] via SHM/direct/direct comm 0x7f93ef76a010 nRanks 16 localhost:1542595:1542900 [5] MCCL INFO Channel 02 : 13[56000] -> 14[5b000] via SHM/direct/direct comm 0x7f93ef76a010 nRanks 16 localhost:1542595:1542900 [5] MCCL INFO Channel 03 : 13[56000] -> 14[5b000] via SHM/direct/direct comm 0x7f93ef76a010 nRanks 16 localhost:1542595:1542900 [5] MCCL INFO Channel 04 : 13[56000] -> 14[5b000] via SHM/direct/direct comm 0x7f93ef76a010 nRanks 16 localhost:1542595:1542900 [5] MCCL INFO Channel 05 : 13[56000] -> 14[5b000] via SHM/direct/direct comm 0x7f93ef76a010 nRanks 16 localhost:1542595:1542900 [5] MCCL INFO Channel 06 : 13[56000] -> 14[5b000] via SHM/direct/direct comm 0x7f93ef76a010 nRanks 16 localhost:1542595:1542900 [5] MCCL INFO Channel 07 : 13[56000] -> 14[5b000] via SHM/direct/direct comm 0x7f93ef76a010 nRanks 16 localhost:1542595:1542900 [5] MCCL INFO Channel 08 : 13[56000] -> 14[5b000] via 
SHM/direct/direct comm 0x7f93ef76a010 nRanks 16 localhost:1542595:1542900 [5] MCCL INFO Channel 09 : 13[56000] -> 14[5b000] via SHM/direct/direct comm 0x7f93ef76a010 nRanks 16 localhost:1542595:1542900 [5] MCCL INFO Channel 10 : 13[56000] -> 14[5b000] via SHM/direct/direct comm 0x7f93ef76a010 nRanks 16 localhost:1542595:1542900 [5] MCCL INFO Channel 11 : 13[56000] -> 14[5b000] via SHM/direct/direct comm 0x7f93ef76a010 nRanks 16 localhost:1542595:1542900 [5] MCCL INFO Channel 12 : 13[56000] -> 14[5b000] via SHM/direct/direct comm 0x7f93ef76a010 nRanks 16 localhost:1542595:1542900 [5] MCCL INFO Channel 13 : 13[56000] -> 14[5b000] via SHM/direct/direct comm 0x7f93ef76a010 nRanks 16 localhost:1542595:1542900 [5] MCCL INFO Channel 14 : 13[56000] -> 14[5b000] via SHM/direct/direct comm 0x7f93ef76a010 nRanks 16 localhost:1542595:1542900 [5] MCCL INFO Channel 15 : 13[56000] -> 14[5b000] via SHM/direct/direct comm 0x7f93ef76a010 nRanks 16 localhost:1542595:1542900 [5] MCCL INFO Channel 16 : 13[56000] -> 14[5b000] via SHM/direct/direct comm 0x7f93ef76a010 nRanks 16 localhost:1542595:1542900 [5] MCCL INFO Channel 17 : 13[56000] -> 14[5b000] via SHM/direct/direct comm 0x7f93ef76a010 nRanks 16 localhost:1542595:1542900 [5] MCCL INFO Channel 18 : 13[56000] -> 14[5b000] via SHM/direct/direct comm 0x7f93ef76a010 nRanks 16 localhost:1542594:1542896 [4] MCCL INFO Channel 00 : 12[55000] -> 13[56000] via SHM/direct/direct comm 0x7fef53940010 nRanks 16 localhost:1542595:1542900 [5] MCCL INFO Channel 19 : 13[56000] -> 14[5b000] via SHM/direct/direct comm 0x7f93ef76a010 nRanks 16 localhost:1542595:1542900 [5] MCCL INFO Channel 20 : 13[56000] -> 14[5b000] via SHM/direct/direct comm 0x7f93ef76a010 nRanks 16 localhost:1542594:1542896 [4] MCCL INFO Channel 01 : 12[55000] -> 13[56000] via SHM/direct/direct comm 0x7fef53940010 nRanks 16 localhost:1542595:1542900 [5] MCCL INFO Channel 21 : 13[56000] -> 14[5b000] via SHM/direct/direct comm 0x7f93ef76a010 nRanks 16 localhost:1542594:1542896 [4] 
MCCL INFO Channel 02 : 12[55000] -> 13[56000] via SHM/direct/direct comm 0x7fef53940010 nRanks 16 localhost:1542595:1542900 [5] MCCL INFO Channel 22 : 13[56000] -> 14[5b000] via SHM/direct/direct comm 0x7f93ef76a010 nRanks 16 localhost:1542594:1542896 [4] MCCL INFO Channel 03 : 12[55000] -> 13[56000] via SHM/direct/direct comm 0x7fef53940010 nRanks 16 localhost:1542595:1542900 [5] MCCL INFO Channel 23 : 13[56000] -> 14[5b000] via SHM/direct/direct comm 0x7f93ef76a010 nRanks 16 localhost:1542594:1542896 [4] MCCL INFO Channel 04 : 12[55000] -> 13[56000] via SHM/direct/direct comm 0x7fef53940010 nRanks 16 localhost:1542595:1542900 [5] MCCL INFO Channel 24 : 13[56000] -> 14[5b000] via SHM/direct/direct comm 0x7f93ef76a010 nRanks 16 localhost:1542594:1542896 [4] MCCL INFO Channel 05 : 12[55000] -> 13[56000] via SHM/direct/direct comm 0x7fef53940010 nRanks 16 localhost:1542595:1542900 [5] MCCL INFO Channel 25 : 13[56000] -> 14[5b000] via SHM/direct/direct comm 0x7f93ef76a010 nRanks 16 localhost:1542594:1542896 [4] MCCL INFO Channel 06 : 12[55000] -> 13[56000] via SHM/direct/direct comm 0x7fef53940010 nRanks 16 localhost:1542595:1542900 [5] MCCL INFO Channel 26 : 13[56000] -> 14[5b000] via SHM/direct/direct comm 0x7f93ef76a010 nRanks 16 localhost:1542594:1542896 [4] MCCL INFO Channel 07 : 12[55000] -> 13[56000] via SHM/direct/direct comm 0x7fef53940010 nRanks 16 localhost:1542595:1542900 [5] MCCL INFO Channel 27 : 13[56000] -> 14[5b000] via SHM/direct/direct comm 0x7f93ef76a010 nRanks 16 localhost:1542594:1542896 [4] MCCL INFO Channel 08 : 12[55000] -> 13[56000] via SHM/direct/direct comm 0x7fef53940010 nRanks 16 localhost:1542595:1542900 [5] MCCL INFO Channel 28 : 13[56000] -> 14[5b000] via SHM/direct/direct comm 0x7f93ef76a010 nRanks 16 localhost:1542594:1542896 [4] MCCL INFO Channel 09 : 12[55000] -> 13[56000] via SHM/direct/direct comm 0x7fef53940010 nRanks 16 localhost:1542595:1542900 [5] MCCL INFO Channel 29 : 13[56000] -> 14[5b000] via SHM/direct/direct comm 
0x7f93ef76a010 nRanks 16 localhost:1542594:1542896 [4] MCCL INFO Channel 10 : 12[55000] -> 13[56000] via SHM/direct/direct comm 0x7fef53940010 nRanks 16 localhost:1542595:1542900 [5] MCCL INFO Channel 30 : 13[56000] -> 14[5b000] via SHM/direct/direct comm 0x7f93ef76a010 nRanks 16 localhost:1542594:1542896 [4] MCCL INFO Channel 11 : 12[55000] -> 13[56000] via SHM/direct/direct comm 0x7fef53940010 nRanks 16 localhost:1542595:1542900 [5] MCCL INFO Channel 31 : 13[56000] -> 14[5b000] via SHM/direct/direct comm 0x7f93ef76a010 nRanks 16 localhost:1542594:1542896 [4] MCCL INFO Channel 12 : 12[55000] -> 13[56000] via SHM/direct/direct comm 0x7fef53940010 nRanks 16 localhost:1542594:1542896 [4] MCCL INFO Channel 13 : 12[55000] -> 13[56000] via SHM/direct/direct comm 0x7fef53940010 nRanks 16 localhost:1542594:1542896 [4] MCCL INFO Channel 14 : 12[55000] -> 13[56000] via SHM/direct/direct comm 0x7fef53940010 nRanks 16 localhost:1542594:1542896 [4] MCCL INFO Channel 15 : 12[55000] -> 13[56000] via SHM/direct/direct comm 0x7fef53940010 nRanks 16 localhost:1542594:1542896 [4] MCCL INFO Channel 16 : 12[55000] -> 13[56000] via SHM/direct/direct comm 0x7fef53940010 nRanks 16 localhost:1542594:1542896 [4] MCCL INFO Channel 17 : 12[55000] -> 13[56000] via SHM/direct/direct comm 0x7fef53940010 nRanks 16 localhost:1542594:1542896 [4] MCCL INFO Channel 18 : 12[55000] -> 13[56000] via SHM/direct/direct comm 0x7fef53940010 nRanks 16 localhost:1542594:1542896 [4] MCCL INFO Channel 19 : 12[55000] -> 13[56000] via SHM/direct/direct comm 0x7fef53940010 nRanks 16 localhost:1542594:1542896 [4] MCCL INFO Channel 20 : 12[55000] -> 13[56000] via SHM/direct/direct comm 0x7fef53940010 nRanks 16 localhost:1542594:1542896 [4] MCCL INFO Channel 21 : 12[55000] -> 13[56000] via SHM/direct/direct comm 0x7fef53940010 nRanks 16 localhost:1542594:1542896 [4] MCCL INFO Channel 22 : 12[55000] -> 13[56000] via SHM/direct/direct comm 0x7fef53940010 nRanks 16 localhost:1542594:1542896 [4] MCCL INFO Channel 23 : 
12[55000] -> 13[56000] via SHM/direct/direct comm 0x7fef53940010 nRanks 16 localhost:1542594:1542896 [4] MCCL INFO Channel 24 : 12[55000] -> 13[56000] via SHM/direct/direct comm 0x7fef53940010 nRanks 16 localhost:1542594:1542896 [4] MCCL INFO Channel 25 : 12[55000] -> 13[56000] via SHM/direct/direct comm 0x7fef53940010 nRanks 16 localhost:1542594:1542896 [4] MCCL INFO Channel 26 : 12[55000] -> 13[56000] via SHM/direct/direct comm 0x7fef53940010 nRanks 16 localhost:1542594:1542896 [4] MCCL INFO Channel 27 : 12[55000] -> 13[56000] via SHM/direct/direct comm 0x7fef53940010 nRanks 16 localhost:1542594:1542896 [4] MCCL INFO Channel 28 : 12[55000] -> 13[56000] via SHM/direct/direct comm 0x7fef53940010 nRanks 16 localhost:1542594:1542896 [4] MCCL INFO Channel 29 : 12[55000] -> 13[56000] via SHM/direct/direct comm 0x7fef53940010 nRanks 16 localhost:1542594:1542896 [4] MCCL INFO Channel 30 : 12[55000] -> 13[56000] via SHM/direct/direct comm 0x7fef53940010 nRanks 16 localhost:1542594:1542896 [4] MCCL INFO Channel 31 : 12[55000] -> 13[56000] via SHM/direct/direct comm 0x7fef53940010 nRanks 16 localhost:1542596:1542902 [6] MCCL INFO Channel 00 : 14[5b000] -> 15[5e000] via SHM/direct/direct comm 0x7efc82c24010 nRanks 16 localhost:1542596:1542902 [6] MCCL INFO Channel 01 : 14[5b000] -> 15[5e000] via SHM/direct/direct comm 0x7efc82c24010 nRanks 16 localhost:1542596:1542902 [6] MCCL INFO Channel 02 : 14[5b000] -> 15[5e000] via SHM/direct/direct comm 0x7efc82c24010 nRanks 16 localhost:1542596:1542902 [6] MCCL INFO Channel 03 : 14[5b000] -> 15[5e000] via SHM/direct/direct comm 0x7efc82c24010 nRanks 16 localhost:1542596:1542902 [6] MCCL INFO Channel 04 : 14[5b000] -> 15[5e000] via SHM/direct/direct comm 0x7efc82c24010 nRanks 16 localhost:1542596:1542902 [6] MCCL INFO Channel 05 : 14[5b000] -> 15[5e000] via SHM/direct/direct comm 0x7efc82c24010 nRanks 16 localhost:1542596:1542902 [6] MCCL INFO Channel 06 : 14[5b000] -> 15[5e000] via SHM/direct/direct comm 0x7efc82c24010 nRanks 16 
localhost:1542596:1542902 [6] MCCL INFO Channel 07 : 14[5b000] -> 15[5e000] via SHM/direct/direct comm 0x7efc82c24010 nRanks 16 localhost:1542596:1542902 [6] MCCL INFO Channel 08 : 14[5b000] -> 15[5e000] via SHM/direct/direct comm 0x7efc82c24010 nRanks 16 localhost:1542596:1542902 [6] MCCL INFO Channel 09 : 14[5b000] -> 15[5e000] via SHM/direct/direct comm 0x7efc82c24010 nRanks 16 localhost:1542596:1542902 [6] MCCL INFO Channel 10 : 14[5b000] -> 15[5e000] via SHM/direct/direct comm 0x7efc82c24010 nRanks 16 localhost:1542596:1542902 [6] MCCL INFO Channel 11 : 14[5b000] -> 15[5e000] via SHM/direct/direct comm 0x7efc82c24010 nRanks 16 localhost:1542596:1542902 [6] MCCL INFO Channel 12 : 14[5b000] -> 15[5e000] via SHM/direct/direct comm 0x7efc82c24010 nRanks 16 localhost:1542596:1542902 [6] MCCL INFO Channel 13 : 14[5b000] -> 15[5e000] via SHM/direct/direct comm 0x7efc82c24010 nRanks 16 localhost:1542596:1542902 [6] MCCL INFO Channel 14 : 14[5b000] -> 15[5e000] via SHM/direct/direct comm 0x7efc82c24010 nRanks 16 localhost:1542596:1542902 [6] MCCL INFO Channel 15 : 14[5b000] -> 15[5e000] via SHM/direct/direct comm 0x7efc82c24010 nRanks 16 localhost:1542596:1542902 [6] MCCL INFO Channel 16 : 14[5b000] -> 15[5e000] via SHM/direct/direct comm 0x7efc82c24010 nRanks 16 localhost:1542596:1542902 [6] MCCL INFO Channel 17 : 14[5b000] -> 15[5e000] via SHM/direct/direct comm 0x7efc82c24010 nRanks 16 localhost:1542596:1542902 [6] MCCL INFO Channel 18 : 14[5b000] -> 15[5e000] via SHM/direct/direct comm 0x7efc82c24010 nRanks 16 localhost:1542596:1542902 [6] MCCL INFO Channel 19 : 14[5b000] -> 15[5e000] via SHM/direct/direct comm 0x7efc82c24010 nRanks 16 localhost:1542596:1542902 [6] MCCL INFO Channel 20 : 14[5b000] -> 15[5e000] via SHM/direct/direct comm 0x7efc82c24010 nRanks 16 localhost:1542596:1542902 [6] MCCL INFO Channel 21 : 14[5b000] -> 15[5e000] via SHM/direct/direct comm 0x7efc82c24010 nRanks 16 localhost:1542596:1542902 [6] MCCL INFO Channel 22 : 14[5b000] -> 15[5e000] via 
SHM/direct/direct comm 0x7efc82c24010 nRanks 16 localhost:1542596:1542902 [6] MCCL INFO Channel 23 : 14[5b000] -> 15[5e000] via SHM/direct/direct comm 0x7efc82c24010 nRanks 16 localhost:1542596:1542902 [6] MCCL INFO Channel 24 : 14[5b000] -> 15[5e000] via SHM/direct/direct comm 0x7efc82c24010 nRanks 16 localhost:1542596:1542902 [6] MCCL INFO Channel 25 : 14[5b000] -> 15[5e000] via SHM/direct/direct comm 0x7efc82c24010 nRanks 16 localhost:1542596:1542902 [6] MCCL INFO Channel 26 : 14[5b000] -> 15[5e000] via SHM/direct/direct comm 0x7efc82c24010 nRanks 16 localhost:1542596:1542902 [6] MCCL INFO Channel 27 : 14[5b000] -> 15[5e000] via SHM/direct/direct comm 0x7efc82c24010 nRanks 16 localhost:1542596:1542902 [6] MCCL INFO Channel 28 : 14[5b000] -> 15[5e000] via SHM/direct/direct comm 0x7efc82c24010 nRanks 16 localhost:1542596:1542902 [6] MCCL INFO Channel 29 : 14[5b000] -> 15[5e000] via SHM/direct/direct comm 0x7efc82c24010 nRanks 16 localhost:1542596:1542902 [6] MCCL INFO Channel 30 : 14[5b000] -> 15[5e000] via SHM/direct/direct comm 0x7efc82c24010 nRanks 16 localhost:1542596:1542902 [6] MCCL INFO Channel 31 : 14[5b000] -> 15[5e000] via SHM/direct/direct comm 0x7efc82c24010 nRanks 16 localhost:1542592:1542897 [2] MCCL INFO Channel 00 : 10[e000] -> 11[f000] via SHM/direct/direct comm 0x7fe15ae00010 nRanks 16 localhost:1542591:1542899 [1] MCCL INFO Channel 00 : 9[b000] -> 10[e000] via SHM/direct/direct comm 0x7f1e96c00010 nRanks 16 localhost:3808851:3809776 [0] MCCL INFO Connection to proxy localRank 0 -> connection 0x7feda8001178 localhost:3808851:3809710 [0] MCCL INFO ip=192.168.100.13, port=0 localhost:3808851:3809776 [0] MCCL INFO Channel 04/0 : 15[5e000] -> 0[5000] [receive] via NET/IB/0 comm 0x7feed6140010 nRanks 16 localhost:1542592:1542897 [2] MCCL INFO Channel 01 : 10[e000] -> 11[f000] via SHM/direct/direct comm 0x7fe15ae00010 nRanks 16 localhost:3808851:3809710 [0] MCCL INFO New proxy recv connection 6 from local rank 0, transport 2 localhost:1542591:1542899 
[1] MCCL INFO Channel 01 : 9[b000] -> 10[e000] via SHM/direct/direct comm 0x7f1e96c00010 nRanks 16 localhost:1542597:1542822 [7] MCCL INFO New proxy send connection 1 from local rank 7, transport 2 localhost:1542591:1542899 [1] MCCL INFO Channel 02 : 9[b000] -> 10[e000] via SHM/direct/direct comm 0x7f1e96c00010 nRanks 16 localhost:1542592:1542897 [2] MCCL INFO Channel 02 : 10[e000] -> 11[f000] via SHM/direct/direct comm 0x7fe15ae00010 nRanks 16 localhost:1542591:1542899 [1] MCCL INFO Channel 03 : 9[b000] -> 10[e000] via SHM/direct/direct comm 0x7f1e96c00010 nRanks 16 localhost:1542592:1542897 [2] MCCL INFO Channel 03 : 10[e000] -> 11[f000] via SHM/direct/direct comm 0x7fe15ae00010 nRanks 16 localhost:1542591:1542899 [1] MCCL INFO Channel 04 : 9[b000] -> 10[e000] via SHM/direct/direct comm 0x7f1e96c00010 nRanks 16 localhost:1542592:1542897 [2] MCCL INFO Channel 04 : 10[e000] -> 11[f000] via SHM/direct/direct comm 0x7fe15ae00010 nRanks 16 localhost:1542591:1542899 [1] MCCL INFO Channel 05 : 9[b000] -> 10[e000] via SHM/direct/direct comm 0x7f1e96c00010 nRanks 16 localhost:1542592:1542897 [2] MCCL INFO Channel 05 : 10[e000] -> 11[f000] via SHM/direct/direct comm 0x7fe15ae00010 nRanks 16 localhost:1542591:1542899 [1] MCCL INFO Channel 06 : 9[b000] -> 10[e000] via SHM/direct/direct comm 0x7f1e96c00010 nRanks 16 localhost:1542592:1542897 [2] MCCL INFO Channel 06 : 10[e000] -> 11[f000] via SHM/direct/direct comm 0x7fe15ae00010 nRanks 16 localhost:1542591:1542899 [1] MCCL INFO Channel 07 : 9[b000] -> 10[e000] via SHM/direct/direct comm 0x7f1e96c00010 nRanks 16 localhost:1542592:1542897 [2] MCCL INFO Channel 07 : 10[e000] -> 11[f000] via SHM/direct/direct comm 0x7fe15ae00010 nRanks 16 localhost:1542591:1542899 [1] MCCL INFO Channel 08 : 9[b000] -> 10[e000] via SHM/direct/direct comm 0x7f1e96c00010 nRanks 16 localhost:1542592:1542897 [2] MCCL INFO Channel 08 : 10[e000] -> 11[f000] via SHM/direct/direct comm 0x7fe15ae00010 nRanks 16 localhost:1542591:1542899 [1] MCCL INFO 
Channel 09 : 9[b000] -> 10[e000] via SHM/direct/direct comm 0x7f1e96c00010 nRanks 16 localhost:1542592:1542897 [2] MCCL INFO Channel 09 : 10[e000] -> 11[f000] via SHM/direct/direct comm 0x7fe15ae00010 nRanks 16 localhost:1542591:1542899 [1] MCCL INFO Channel 10 : 9[b000] -> 10[e000] via SHM/direct/direct comm 0x7f1e96c00010 nRanks 16 localhost:1542592:1542897 [2] MCCL INFO Channel 10 : 10[e000] -> 11[f000] via SHM/direct/direct comm 0x7fe15ae00010 nRanks 16 localhost:1542591:1542899 [1] MCCL INFO Channel 11 : 9[b000] -> 10[e000] via SHM/direct/direct comm 0x7f1e96c00010 nRanks 16 localhost:1542592:1542897 [2] MCCL INFO Channel 11 : 10[e000] -> 11[f000] via SHM/direct/direct comm 0x7fe15ae00010 nRanks 16 localhost:1542591:1542899 [1] MCCL INFO Channel 12 : 9[b000] -> 10[e000] via SHM/direct/direct comm 0x7f1e96c00010 nRanks 16 localhost:1542592:1542897 [2] MCCL INFO Channel 12 : 10[e000] -> 11[f000] via SHM/direct/direct comm 0x7fe15ae00010 nRanks 16 localhost:1542591:1542899 [1] MCCL INFO Channel 13 : 9[b000] -> 10[e000] via SHM/direct/direct comm 0x7f1e96c00010 nRanks 16 localhost:1542592:1542897 [2] MCCL INFO Channel 13 : 10[e000] -> 11[f000] via SHM/direct/direct comm 0x7fe15ae00010 nRanks 16 localhost:1542591:1542899 [1] MCCL INFO Channel 14 : 9[b000] -> 10[e000] via SHM/direct/direct comm 0x7f1e96c00010 nRanks 16 localhost:1542592:1542897 [2] MCCL INFO Channel 14 : 10[e000] -> 11[f000] via SHM/direct/direct comm 0x7fe15ae00010 nRanks 16 localhost:1542591:1542899 [1] MCCL INFO Channel 15 : 9[b000] -> 10[e000] via SHM/direct/direct comm 0x7f1e96c00010 nRanks 16 localhost:1542592:1542897 [2] MCCL INFO Channel 15 : 10[e000] -> 11[f000] via SHM/direct/direct comm 0x7fe15ae00010 nRanks 16 localhost:1542591:1542899 [1] MCCL INFO Channel 16 : 9[b000] -> 10[e000] via SHM/direct/direct comm 0x7f1e96c00010 nRanks 16 localhost:1542592:1542897 [2] MCCL INFO Channel 16 : 10[e000] -> 11[f000] via SHM/direct/direct comm 0x7fe15ae00010 nRanks 16 localhost:1542591:1542899 [1] 
MCCL INFO Channel 17 : 9[b000] -> 10[e000] via SHM/direct/direct comm 0x7f1e96c00010 nRanks 16 localhost:1542592:1542897 [2] MCCL INFO Channel 17 : 10[e000] -> 11[f000] via SHM/direct/direct comm 0x7fe15ae00010 nRanks 16 localhost:1542591:1542899 [1] MCCL INFO Channel 18 : 9[b000] -> 10[e000] via SHM/direct/direct comm 0x7f1e96c00010 nRanks 16 localhost:1542592:1542897 [2] MCCL INFO Channel 18 : 10[e000] -> 11[f000] via SHM/direct/direct comm 0x7fe15ae00010 nRanks 16 localhost:1542591:1542899 [1] MCCL INFO Channel 19 : 9[b000] -> 10[e000] via SHM/direct/direct comm 0x7f1e96c00010 nRanks 16 localhost:1542592:1542897 [2] MCCL INFO Channel 19 : 10[e000] -> 11[f000] via SHM/direct/direct comm 0x7fe15ae00010 nRanks 16 localhost:1542591:1542899 [1] MCCL INFO Channel 20 : 9[b000] -> 10[e000] via SHM/direct/direct comm 0x7f1e96c00010 nRanks 16 localhost:1542592:1542897 [2] MCCL INFO Channel 20 : 10[e000] -> 11[f000] via SHM/direct/direct comm 0x7fe15ae00010 nRanks 16 localhost:1542591:1542899 [1] MCCL INFO Channel 21 : 9[b000] -> 10[e000] via SHM/direct/direct comm 0x7f1e96c00010 nRanks 16 localhost:1542592:1542897 [2] MCCL INFO Channel 21 : 10[e000] -> 11[f000] via SHM/direct/direct comm 0x7fe15ae00010 nRanks 16 localhost:1542591:1542899 [1] MCCL INFO Channel 22 : 9[b000] -> 10[e000] via SHM/direct/direct comm 0x7f1e96c00010 nRanks 16 localhost:1542592:1542897 [2] MCCL INFO Channel 22 : 10[e000] -> 11[f000] via SHM/direct/direct comm 0x7fe15ae00010 nRanks 16 localhost:1542591:1542899 [1] MCCL INFO Channel 23 : 9[b000] -> 10[e000] via SHM/direct/direct comm 0x7f1e96c00010 nRanks 16 localhost:1542592:1542897 [2] MCCL INFO Channel 23 : 10[e000] -> 11[f000] via SHM/direct/direct comm 0x7fe15ae00010 nRanks 16 localhost:1542591:1542899 [1] MCCL INFO Channel 24 : 9[b000] -> 10[e000] via SHM/direct/direct comm 0x7f1e96c00010 nRanks 16 localhost:1542592:1542897 [2] MCCL INFO Channel 24 : 10[e000] -> 11[f000] via SHM/direct/direct comm 0x7fe15ae00010 nRanks 16 
localhost:1542591:1542899 [1] MCCL INFO Channel 25 : 9[b000] -> 10[e000] via SHM/direct/direct comm 0x7f1e96c00010 nRanks 16 localhost:1542592:1542897 [2] MCCL INFO Channel 25 : 10[e000] -> 11[f000] via SHM/direct/direct comm 0x7fe15ae00010 nRanks 16 localhost:1542591:1542899 [1] MCCL INFO Channel 26 : 9[b000] -> 10[e000] via SHM/direct/direct comm 0x7f1e96c00010 nRanks 16 localhost:1542591:1542899 [1] MCCL INFO Channel 27 : 9[b000] -> 10[e000] via SHM/direct/direct comm 0x7f1e96c00010 nRanks 16 localhost:1542592:1542897 [2] MCCL INFO Channel 26 : 10[e000] -> 11[f000] via SHM/direct/direct comm 0x7fe15ae00010 nRanks 16 localhost:1542591:1542899 [1] MCCL INFO Channel 28 : 9[b000] -> 10[e000] via SHM/direct/direct comm 0x7f1e96c00010 nRanks 16 localhost:1542592:1542897 [2] MCCL INFO Channel 27 : 10[e000] -> 11[f000] via SHM/direct/direct comm 0x7fe15ae00010 nRanks 16 localhost:1542591:1542899 [1] MCCL INFO Channel 29 : 9[b000] -> 10[e000] via SHM/direct/direct comm 0x7f1e96c00010 nRanks 16 localhost:1542592:1542897 [2] MCCL INFO Channel 28 : 10[e000] -> 11[f000] via SHM/direct/direct comm 0x7fe15ae00010 nRanks 16 localhost:1542591:1542899 [1] MCCL INFO Channel 30 : 9[b000] -> 10[e000] via SHM/direct/direct comm 0x7f1e96c00010 nRanks 16 localhost:1542592:1542897 [2] MCCL INFO Channel 29 : 10[e000] -> 11[f000] via SHM/direct/direct comm 0x7fe15ae00010 nRanks 16 localhost:1542591:1542899 [1] MCCL INFO Channel 31 : 9[b000] -> 10[e000] via SHM/direct/direct comm 0x7f1e96c00010 nRanks 16 localhost:1542592:1542897 [2] MCCL INFO Channel 30 : 10[e000] -> 11[f000] via SHM/direct/direct comm 0x7fe15ae00010 nRanks 16 localhost:1542592:1542897 [2] MCCL INFO Channel 31 : 10[e000] -> 11[f000] via SHM/direct/direct comm 0x7fe15ae00010 nRanks 16 localhost:1542590:1542898 [0] MCCL INFO Connection to proxy localRank 0 -> connection 0x7fcda8001178 localhost:1542590:1542823 [0] MCCL INFO ip=192.168.100.14, port=0 localhost:1542590:1542898 [0] MCCL INFO Channel 04/0 : 7[5e000] -> 8[5000] 
[receive] via NET/IB/0 comm 0x7fced0b40010 nRanks 16 localhost:1542590:1542823 [0] MCCL INFO New proxy recv connection 6 from local rank 0, transport 2 localhost:1542597:1542901 [7] MCCL INFO Connection to proxy localRank 7 -> connection 0x7f74a0001058 localhost:1542597:1542901 [7] MCCL INFO Channel 00/0 : 15[5e000] -> 0[5000] [send] via NET/IB/0 comm 0x7f75cdd40010 nRanks 16 localhost:1542597:1542822 [7] MCCL INFO New proxy send connection 2 from local rank 7, transport 2 localhost:3808851:3809776 [0] MCCL INFO Connection to proxy localRank 0 -> connection 0x7feda80011c0 localhost:3808851:3809710 [0] MCCL INFO ip=192.168.100.13, port=0 localhost:3808851:3809776 [0] MCCL INFO Channel 05/0 : 15[5e000] -> 0[5000] [receive] via NET/IB/0 comm 0x7feed6140010 nRanks 16 localhost:3808851:3809710 [0] MCCL INFO New proxy recv connection 7 from local rank 0, transport 2 localhost:1542590:1542898 [0] MCCL INFO Connection to proxy localRank 0 -> connection 0x7fcda80011c0 localhost:1542590:1542823 [0] MCCL INFO ip=192.168.100.14, port=0 localhost:1542590:1542898 [0] MCCL INFO Channel 05/0 : 7[5e000] -> 8[5000] [receive] via NET/IB/0 comm 0x7fced0b40010 nRanks 16 localhost:1542590:1542823 [0] MCCL INFO New proxy recv connection 7 from local rank 0, transport 2 localhost:3808855:3809772 [4] MCCL INFO Channel 00 : 4[55000] -> 5[56000] via SHM/direct/direct comm 0x7fb633d40010 nRanks 16 localhost:3808855:3809772 [4] MCCL INFO Channel 01 : 4[55000] -> 5[56000] via SHM/direct/direct comm 0x7fb633d40010 nRanks 16 localhost:3808855:3809772 [4] MCCL INFO Channel 02 : 4[55000] -> 5[56000] via SHM/direct/direct comm 0x7fb633d40010 nRanks 16 localhost:3808855:3809772 [4] MCCL INFO Channel 03 : 4[55000] -> 5[56000] via SHM/direct/direct comm 0x7fb633d40010 nRanks 16 localhost:3808855:3809772 [4] MCCL INFO Channel 04 : 4[55000] -> 5[56000] via SHM/direct/direct comm 0x7fb633d40010 nRanks 16 localhost:3808855:3809772 [4] MCCL INFO Channel 05 : 4[55000] -> 5[56000] via SHM/direct/direct comm 
0x7fb633d40010 nRanks 16 localhost:3808864:3809707 [7] MCCL INFO New proxy send connection 1 from local rank 7, transport 2 localhost:3808855:3809772 [4] MCCL INFO Channel 06 : 4[55000] -> 5[56000] via SHM/direct/direct comm 0x7fb633d40010 nRanks 16 localhost:3808855:3809772 [4] MCCL INFO Channel 07 : 4[55000] -> 5[56000] via SHM/direct/direct comm 0x7fb633d40010 nRanks 16 localhost:3808855:3809772 [4] MCCL INFO Channel 08 : 4[55000] -> 5[56000] via SHM/direct/direct comm 0x7fb633d40010 nRanks 16 localhost:3808855:3809772 [4] MCCL INFO Channel 09 : 4[55000] -> 5[56000] via SHM/direct/direct comm 0x7fb633d40010 nRanks 16 localhost:3808855:3809772 [4] MCCL INFO Channel 10 : 4[55000] -> 5[56000] via SHM/direct/direct comm 0x7fb633d40010 nRanks 16 localhost:3808855:3809772 [4] MCCL INFO Channel 11 : 4[55000] -> 5[56000] via SHM/direct/direct comm 0x7fb633d40010 nRanks 16 localhost:3808855:3809772 [4] MCCL INFO Channel 12 : 4[55000] -> 5[56000] via SHM/direct/direct comm 0x7fb633d40010 nRanks 16 localhost:3808855:3809772 [4] MCCL INFO Channel 13 : 4[55000] -> 5[56000] via SHM/direct/direct comm 0x7fb633d40010 nRanks 16 localhost:3808855:3809772 [4] MCCL INFO Channel 14 : 4[55000] -> 5[56000] via SHM/direct/direct comm 0x7fb633d40010 nRanks 16 localhost:3808855:3809772 [4] MCCL INFO Channel 15 : 4[55000] -> 5[56000] via SHM/direct/direct comm 0x7fb633d40010 nRanks 16 localhost:3808855:3809772 [4] MCCL INFO Channel 16 : 4[55000] -> 5[56000] via SHM/direct/direct comm 0x7fb633d40010 nRanks 16 localhost:3808855:3809772 [4] MCCL INFO Channel 17 : 4[55000] -> 5[56000] via SHM/direct/direct comm 0x7fb633d40010 nRanks 16 localhost:3808855:3809772 [4] MCCL INFO Channel 18 : 4[55000] -> 5[56000] via SHM/direct/direct comm 0x7fb633d40010 nRanks 16 localhost:3808855:3809772 [4] MCCL INFO Channel 19 : 4[55000] -> 5[56000] via SHM/direct/direct comm 0x7fb633d40010 nRanks 16 localhost:3808855:3809772 [4] MCCL INFO Channel 20 : 4[55000] -> 5[56000] via SHM/direct/direct comm 
0x7fb633d40010 nRanks 16 localhost:3808855:3809772 [4] MCCL INFO Channel 21 : 4[55000] -> 5[56000] via SHM/direct/direct comm 0x7fb633d40010 nRanks 16 localhost:3808855:3809772 [4] MCCL INFO Channel 22 : 4[55000] -> 5[56000] via SHM/direct/direct comm 0x7fb633d40010 nRanks 16 localhost:3808855:3809772 [4] MCCL INFO Channel 23 : 4[55000] -> 5[56000] via SHM/direct/direct comm 0x7fb633d40010 nRanks 16 localhost:3808855:3809772 [4] MCCL INFO Channel 24 : 4[55000] -> 5[56000] via SHM/direct/direct comm 0x7fb633d40010 nRanks 16 localhost:3808855:3809772 [4] MCCL INFO Channel 25 : 4[55000] -> 5[56000] via SHM/direct/direct comm 0x7fb633d40010 nRanks 16 localhost:3808855:3809772 [4] MCCL INFO Channel 26 : 4[55000] -> 5[56000] via SHM/direct/direct comm 0x7fb633d40010 nRanks 16 localhost:3808855:3809772 [4] MCCL INFO Channel 27 : 4[55000] -> 5[56000] via SHM/direct/direct comm 0x7fb633d40010 nRanks 16 localhost:3808855:3809772 [4] MCCL INFO Channel 28 : 4[55000] -> 5[56000] via SHM/direct/direct comm 0x7fb633d40010 nRanks 16 localhost:3808855:3809772 [4] MCCL INFO Channel 29 : 4[55000] -> 5[56000] via SHM/direct/direct comm 0x7fb633d40010 nRanks 16 localhost:3808855:3809772 [4] MCCL INFO Channel 30 : 4[55000] -> 5[56000] via SHM/direct/direct comm 0x7fb633d40010 nRanks 16 localhost:3808855:3809772 [4] MCCL INFO Channel 31 : 4[55000] -> 5[56000] via SHM/direct/direct comm 0x7fb633d40010 nRanks 16 localhost:1542597:1542901 [7] MCCL INFO Connection to proxy localRank 7 -> connection 0x7f74a00010a0 localhost:3808851:3809776 [0] MCCL INFO Connection to proxy localRank 0 -> connection 0x7feda8001208 localhost:1542597:1542901 [7] MCCL INFO Channel 01/0 : 15[5e000] -> 0[5000] [send] via NET/IB/0 comm 0x7f75cdd40010 nRanks 16 localhost:1542597:1542822 [7] MCCL INFO New proxy send connection 3 from local rank 7, transport 2 localhost:3808851:3809710 [0] MCCL INFO ip=192.168.100.13, port=0 localhost:3808851:3809776 [0] MCCL INFO Channel 06/0 : 15[5e000] -> 0[5000] [receive] via 
NET/IB/0 comm 0x7feed6140010 nRanks 16 localhost:3808851:3809710 [0] MCCL INFO New proxy recv connection 8 from local rank 0, transport 2 localhost:3808857:3809770 [5] MCCL INFO Channel 00 : 5[56000] -> 6[5b000] via SHM/direct/direct comm 0x7f3c2ee24010 nRanks 16 localhost:3808857:3809770 [5] MCCL INFO Channel 01 : 5[56000] -> 6[5b000] via SHM/direct/direct comm 0x7f3c2ee24010 nRanks 16 localhost:3808857:3809770 [5] MCCL INFO Channel 02 : 5[56000] -> 6[5b000] via SHM/direct/direct comm 0x7f3c2ee24010 nRanks 16 localhost:3808857:3809770 [5] MCCL INFO Channel 03 : 5[56000] -> 6[5b000] via SHM/direct/direct comm 0x7f3c2ee24010 nRanks 16 localhost:3808857:3809770 [5] MCCL INFO Channel 04 : 5[56000] -> 6[5b000] via SHM/direct/direct comm 0x7f3c2ee24010 nRanks 16 localhost:3808857:3809770 [5] MCCL INFO Channel 05 : 5[56000] -> 6[5b000] via SHM/direct/direct comm 0x7f3c2ee24010 nRanks 16 localhost:3808857:3809770 [5] MCCL INFO Channel 06 : 5[56000] -> 6[5b000] via SHM/direct/direct comm 0x7f3c2ee24010 nRanks 16 localhost:3808857:3809770 [5] MCCL INFO Channel 07 : 5[56000] -> 6[5b000] via SHM/direct/direct comm 0x7f3c2ee24010 nRanks 16 localhost:3808857:3809770 [5] MCCL INFO Channel 08 : 5[56000] -> 6[5b000] via SHM/direct/direct comm 0x7f3c2ee24010 nRanks 16 localhost:3808857:3809770 [5] MCCL INFO Channel 09 : 5[56000] -> 6[5b000] via SHM/direct/direct comm 0x7f3c2ee24010 nRanks 16 localhost:3808857:3809770 [5] MCCL INFO Channel 10 : 5[56000] -> 6[5b000] via SHM/direct/direct comm 0x7f3c2ee24010 nRanks 16 localhost:3808857:3809770 [5] MCCL INFO Channel 11 : 5[56000] -> 6[5b000] via SHM/direct/direct comm 0x7f3c2ee24010 nRanks 16 localhost:3808857:3809770 [5] MCCL INFO Channel 12 : 5[56000] -> 6[5b000] via SHM/direct/direct comm 0x7f3c2ee24010 nRanks 16 localhost:3808857:3809770 [5] MCCL INFO Channel 13 : 5[56000] -> 6[5b000] via SHM/direct/direct comm 0x7f3c2ee24010 nRanks 16 localhost:3808857:3809770 [5] MCCL INFO Channel 14 : 5[56000] -> 6[5b000] via SHM/direct/direct 
comm 0x7f3c2ee24010 nRanks 16 localhost:3808857:3809770 [5] MCCL INFO Channel 15 : 5[56000] -> 6[5b000] via SHM/direct/direct comm 0x7f3c2ee24010 nRanks 16 localhost:3808858:3809775 [6] MCCL INFO Channel 00 : 6[5b000] -> 7[5e000] via SHM/direct/direct comm 0x7fcf8a717010 nRanks 16 localhost:3808857:3809770 [5] MCCL INFO Channel 16 : 5[56000] -> 6[5b000] via SHM/direct/direct comm 0x7f3c2ee24010 nRanks 16 localhost:3808858:3809775 [6] MCCL INFO Channel 01 : 6[5b000] -> 7[5e000] via SHM/direct/direct comm 0x7fcf8a717010 nRanks 16 localhost:3808857:3809770 [5] MCCL INFO Channel 17 : 5[56000] -> 6[5b000] via SHM/direct/direct comm 0x7f3c2ee24010 nRanks 16 localhost:3808858:3809775 [6] MCCL INFO Channel 02 : 6[5b000] -> 7[5e000] via SHM/direct/direct comm 0x7fcf8a717010 nRanks 16 localhost:3808857:3809770 [5] MCCL INFO Channel 18 : 5[56000] -> 6[5b000] via SHM/direct/direct comm 0x7f3c2ee24010 nRanks 16 localhost:3808857:3809770 [5] MCCL INFO Channel 19 : 5[56000] -> 6[5b000] via SHM/direct/direct comm 0x7f3c2ee24010 nRanks 16 localhost:3808858:3809775 [6] MCCL INFO Channel 03 : 6[5b000] -> 7[5e000] via SHM/direct/direct comm 0x7fcf8a717010 nRanks 16 localhost:3808857:3809770 [5] MCCL INFO Channel 20 : 5[56000] -> 6[5b000] via SHM/direct/direct comm 0x7f3c2ee24010 nRanks 16 localhost:3808858:3809775 [6] MCCL INFO Channel 04 : 6[5b000] -> 7[5e000] via SHM/direct/direct comm 0x7fcf8a717010 nRanks 16 localhost:3808857:3809770 [5] MCCL INFO Channel 21 : 5[56000] -> 6[5b000] via SHM/direct/direct comm 0x7f3c2ee24010 nRanks 16 localhost:3808857:3809770 [5] MCCL INFO Channel 22 : 5[56000] -> 6[5b000] via SHM/direct/direct comm 0x7f3c2ee24010 nRanks 16 localhost:3808858:3809775 [6] MCCL INFO Channel 05 : 6[5b000] -> 7[5e000] via SHM/direct/direct comm 0x7fcf8a717010 nRanks 16 localhost:3808857:3809770 [5] MCCL INFO Channel 23 : 5[56000] -> 6[5b000] via SHM/direct/direct comm 0x7f3c2ee24010 nRanks 16 localhost:3808858:3809775 [6] MCCL INFO Channel 06 : 6[5b000] -> 7[5e000] via 
SHM/direct/direct comm 0x7fcf8a717010 nRanks 16 localhost:3808857:3809770 [5] MCCL INFO Channel 24 : 5[56000] -> 6[5b000] via SHM/direct/direct comm 0x7f3c2ee24010 nRanks 16 localhost:3808858:3809775 [6] MCCL INFO Channel 07 : 6[5b000] -> 7[5e000] via SHM/direct/direct comm 0x7fcf8a717010 nRanks 16 localhost:3808857:3809770 [5] MCCL INFO Channel 25 : 5[56000] -> 6[5b000] via SHM/direct/direct comm 0x7f3c2ee24010 nRanks 16 localhost:3808858:3809775 [6] MCCL INFO Channel 08 : 6[5b000] -> 7[5e000] via SHM/direct/direct comm 0x7fcf8a717010 nRanks 16 localhost:3808857:3809770 [5] MCCL INFO Channel 26 : 5[56000] -> 6[5b000] via SHM/direct/direct comm 0x7f3c2ee24010 nRanks 16 localhost:3808858:3809775 [6] MCCL INFO Channel 09 : 6[5b000] -> 7[5e000] via SHM/direct/direct comm 0x7fcf8a717010 nRanks 16 localhost:3808857:3809770 [5] MCCL INFO Channel 27 : 5[56000] -> 6[5b000] via SHM/direct/direct comm 0x7f3c2ee24010 nRanks 16 localhost:3808858:3809775 [6] MCCL INFO Channel 10 : 6[5b000] -> 7[5e000] via SHM/direct/direct comm 0x7fcf8a717010 nRanks 16 localhost:3808857:3809770 [5] MCCL INFO Channel 28 : 5[56000] -> 6[5b000] via SHM/direct/direct comm 0x7f3c2ee24010 nRanks 16 localhost:3808858:3809775 [6] MCCL INFO Channel 11 : 6[5b000] -> 7[5e000] via SHM/direct/direct comm 0x7fcf8a717010 nRanks 16 localhost:3808857:3809770 [5] MCCL INFO Channel 29 : 5[56000] -> 6[5b000] via SHM/direct/direct comm 0x7f3c2ee24010 nRanks 16 localhost:3808858:3809775 [6] MCCL INFO Channel 12 : 6[5b000] -> 7[5e000] via SHM/direct/direct comm 0x7fcf8a717010 nRanks 16 localhost:3808857:3809770 [5] MCCL INFO Channel 30 : 5[56000] -> 6[5b000] via SHM/direct/direct comm 0x7f3c2ee24010 nRanks 16 localhost:3808858:3809775 [6] MCCL INFO Channel 13 : 6[5b000] -> 7[5e000] via SHM/direct/direct comm 0x7fcf8a717010 nRanks 16 localhost:3808857:3809770 [5] MCCL INFO Channel 31 : 5[56000] -> 6[5b000] via SHM/direct/direct comm 0x7f3c2ee24010 nRanks 16 localhost:3808858:3809775 [6] MCCL INFO Channel 14 : 6[5b000] 
-> 7[5e000] via SHM/direct/direct comm 0x7fcf8a717010 nRanks 16 localhost:3808858:3809775 [6] MCCL INFO Channel 15 : 6[5b000] -> 7[5e000] via SHM/direct/direct comm 0x7fcf8a717010 nRanks 16 localhost:3808858:3809775 [6] MCCL INFO Channel 16 : 6[5b000] -> 7[5e000] via SHM/direct/direct comm 0x7fcf8a717010 nRanks 16 localhost:3808858:3809775 [6] MCCL INFO Channel 17 : 6[5b000] -> 7[5e000] via SHM/direct/direct comm 0x7fcf8a717010 nRanks 16 localhost:3808858:3809775 [6] MCCL INFO Channel 18 : 6[5b000] -> 7[5e000] via SHM/direct/direct comm 0x7fcf8a717010 nRanks 16 localhost:3808858:3809775 [6] MCCL INFO Channel 19 : 6[5b000] -> 7[5e000] via SHM/direct/direct comm 0x7fcf8a717010 nRanks 16 localhost:3808858:3809775 [6] MCCL INFO Channel 20 : 6[5b000] -> 7[5e000] via SHM/direct/direct comm 0x7fcf8a717010 nRanks 16 localhost:3808858:3809775 [6] MCCL INFO Channel 21 : 6[5b000] -> 7[5e000] via SHM/direct/direct comm 0x7fcf8a717010 nRanks 16 localhost:3808858:3809775 [6] MCCL INFO Channel 22 : 6[5b000] -> 7[5e000] via SHM/direct/direct comm 0x7fcf8a717010 nRanks 16 localhost:3808858:3809775 [6] MCCL INFO Channel 23 : 6[5b000] -> 7[5e000] via SHM/direct/direct comm 0x7fcf8a717010 nRanks 16 localhost:3808858:3809775 [6] MCCL INFO Channel 24 : 6[5b000] -> 7[5e000] via SHM/direct/direct comm 0x7fcf8a717010 nRanks 16 localhost:3808858:3809775 [6] MCCL INFO Channel 25 : 6[5b000] -> 7[5e000] via SHM/direct/direct comm 0x7fcf8a717010 nRanks 16 localhost:3808858:3809775 [6] MCCL INFO Channel 26 : 6[5b000] -> 7[5e000] via SHM/direct/direct comm 0x7fcf8a717010 nRanks 16 localhost:3808858:3809775 [6] MCCL INFO Channel 27 : 6[5b000] -> 7[5e000] via SHM/direct/direct comm 0x7fcf8a717010 nRanks 16 localhost:3808858:3809775 [6] MCCL INFO Channel 28 : 6[5b000] -> 7[5e000] via SHM/direct/direct comm 0x7fcf8a717010 nRanks 16 localhost:3808858:3809775 [6] MCCL INFO Channel 29 : 6[5b000] -> 7[5e000] via SHM/direct/direct comm 0x7fcf8a717010 nRanks 16 localhost:3808858:3809775 [6] MCCL INFO 
Channel 30 : 6[5b000] -> 7[5e000] via SHM/direct/direct comm 0x7fcf8a717010 nRanks 16 localhost:3808858:3809775 [6] MCCL INFO Channel 31 : 6[5b000] -> 7[5e000] via SHM/direct/direct comm 0x7fcf8a717010 nRanks 16 localhost:1542590:1542898 [0] MCCL INFO Connection to proxy localRank 0 -> connection 0x7fcda8001208 localhost:1542590:1542823 [0] MCCL INFO ip=192.168.100.14, port=0 localhost:1542590:1542898 [0] MCCL INFO Channel 06/0 : 7[5e000] -> 8[5000] [receive] via NET/IB/0 comm 0x7fced0b40010 nRanks 16 localhost:1542590:1542823 [0] MCCL INFO New proxy recv connection 8 from local rank 0, transport 2 localhost:3808864:3809771 [7] MCCL INFO Connection to proxy localRank 7 -> connection 0x7f1378001058 localhost:3808864:3809771 [7] MCCL INFO Channel 00/0 : 7[5e000] -> 8[5000] [send] via NET/IB/0 comm 0x7f14a6340010 nRanks 16 localhost:3808864:3809707 [7] MCCL INFO New proxy send connection 2 from local rank 7, transport 2 localhost:1542597:1542901 [7] MCCL INFO Connection to proxy localRank 7 -> connection 0x7f74a00010e8 localhost:1542597:1542901 [7] MCCL INFO Channel 02/0 : 15[5e000] -> 0[5000] [send] via NET/IB/0 comm 0x7f75cdd40010 nRanks 16 localhost:1542597:1542822 [7] MCCL INFO New proxy send connection 4 from local rank 7, transport 2 localhost:3808851:3809776 [0] MCCL INFO Connection to proxy localRank 0 -> connection 0x7feda8001250 localhost:3808851:3809710 [0] MCCL INFO ip=192.168.100.13, port=0 localhost:3808851:3809776 [0] MCCL INFO Channel 07/0 : 15[5e000] -> 0[5000] [receive] via NET/IB/0 comm 0x7feed6140010 nRanks 16 localhost:3808851:3809710 [0] MCCL INFO New proxy recv connection 9 from local rank 0, transport 2 localhost:1542590:1542898 [0] MCCL INFO Connection to proxy localRank 0 -> connection 0x7fcda8001250 localhost:1542590:1542823 [0] MCCL INFO ip=192.168.100.14, port=0 localhost:1542590:1542898 [0] MCCL INFO Channel 07/0 : 7[5e000] -> 8[5000] [receive] via NET/IB/0 comm 0x7fced0b40010 nRanks 16 localhost:1542590:1542823 [0] MCCL INFO New proxy 
recv connection 9 from local rank 0, transport 2 localhost:3808864:3809771 [7] MCCL INFO Connection to proxy localRank 7 -> connection 0x7f13780010a0 localhost:3808864:3809771 [7] MCCL INFO Channel 01/0 : 7[5e000] -> 8[5000] [send] via NET/IB/0 comm 0x7f14a6340010 nRanks 16 localhost:3808864:3809707 [7] MCCL INFO New proxy send connection 3 from local rank 7, transport 2 localhost:1542597:1542901 [7] MCCL INFO Connection to proxy localRank 7 -> connection 0x7f74a0001130 localhost:1542597:1542901 [7] MCCL INFO Channel 03/0 : 15[5e000] -> 0[5000] [send] via NET/IB/0 comm 0x7f75cdd40010 nRanks 16 localhost:1542597:1542822 [7] MCCL INFO New proxy send connection 5 from local rank 7, transport 2 localhost:3808851:3809776 [0] MCCL INFO Connection to proxy localRank 0 -> connection 0x7feda8001298 localhost:3808851:3809710 [0] MCCL INFO ip=192.168.100.13, port=0 localhost:3808851:3809776 [0] MCCL INFO Channel 08/0 : 15[5e000] -> 0[5000] [receive] via NET/IB/0 comm 0x7feed6140010 nRanks 16 localhost:3808851:3809710 [0] MCCL INFO New proxy recv connection 10 from local rank 0, transport 2 localhost:1542590:1542898 [0] MCCL INFO Connection to proxy localRank 0 -> connection 0x7fcda8001298 localhost:1542590:1542823 [0] MCCL INFO ip=192.168.100.14, port=0 localhost:1542590:1542898 [0] MCCL INFO Channel 08/0 : 7[5e000] -> 8[5000] [receive] via NET/IB/0 comm 0x7fced0b40010 nRanks 16 localhost:1542590:1542823 [0] MCCL INFO New proxy recv connection 10 from local rank 0, transport 2 localhost:3808864:3809771 [7] MCCL INFO Connection to proxy localRank 7 -> connection 0x7f13780010e8 localhost:3808864:3809771 [7] MCCL INFO Channel 02/0 : 7[5e000] -> 8[5000] [send] via NET/IB/0 comm 0x7f14a6340010 nRanks 16 localhost:3808864:3809707 [7] MCCL INFO New proxy send connection 4 from local rank 7, transport 2 localhost:1542597:1542901 [7] MCCL INFO Connection to proxy localRank 7 -> connection 0x7f74a0001178 localhost:1542597:1542901 [7] MCCL INFO Channel 04/0 : 15[5e000] -> 0[5000] [send] 
via NET/IB/0 comm 0x7f75cdd40010 nRanks 16 localhost:1542597:1542822 [7] MCCL INFO New proxy send connection 6 from local rank 7, transport 2 localhost:3808851:3809776 [0] MCCL INFO Connection to proxy localRank 0 -> connection 0x7feda80012e0 localhost:3808851:3809710 [0] MCCL INFO ip=192.168.100.13, port=0 localhost:3808851:3809776 [0] MCCL INFO Channel 09/0 : 15[5e000] -> 0[5000] [receive] via NET/IB/0 comm 0x7feed6140010 nRanks 16 localhost:3808851:3809710 [0] MCCL INFO New proxy recv connection 11 from local rank 0, transport 2 localhost:1542590:1542898 [0] MCCL INFO Connection to proxy localRank 0 -> connection 0x7fcda80012e0 localhost:1542590:1542823 [0] MCCL INFO ip=192.168.100.14, port=0 localhost:1542590:1542898 [0] MCCL INFO Channel 09/0 : 7[5e000] -> 8[5000] [receive] via NET/IB/0 comm 0x7fced0b40010 nRanks 16 localhost:1542590:1542823 [0] MCCL INFO New proxy recv connection 11 from local rank 0, transport 2 localhost:3808864:3809771 [7] MCCL INFO Connection to proxy localRank 7 -> connection 0x7f1378001130 localhost:3808864:3809771 [7] MCCL INFO Channel 03/0 : 7[5e000] -> 8[5000] [send] via NET/IB/0 comm 0x7f14a6340010 nRanks 16 localhost:3808864:3809707 [7] MCCL INFO New proxy send connection 5 from local rank 7, transport 2 localhost:1542597:1542901 [7] MCCL INFO Connection to proxy localRank 7 -> connection 0x7f74a00011c0 localhost:1542597:1542901 [7] MCCL INFO Channel 05/0 : 15[5e000] -> 0[5000] [send] via NET/IB/0 comm 0x7f75cdd40010 nRanks 16 localhost:1542597:1542822 [7] MCCL INFO New proxy send connection 7 from local rank 7, transport 2 localhost:3808851:3809776 [0] MCCL INFO Connection to proxy localRank 0 -> connection 0x7feda8001328 localhost:3808851:3809710 [0] MCCL INFO ip=192.168.100.13, port=0 localhost:3808851:3809776 [0] MCCL INFO Channel 10/0 : 15[5e000] -> 0[5000] [receive] via NET/IB/0 comm 0x7feed6140010 nRanks 16 localhost:3808851:3809710 [0] MCCL INFO New proxy recv connection 12 from local rank 0, transport 2 
localhost:1542590:1542898 [0] MCCL INFO Connection to proxy localRank 0 -> connection 0x7fcda8001328 localhost:1542590:1542823 [0] MCCL INFO ip=192.168.100.14, port=0 localhost:1542590:1542898 [0] MCCL INFO Channel 10/0 : 7[5e000] -> 8[5000] [receive] via NET/IB/0 comm 0x7fced0b40010 nRanks 16 localhost:1542590:1542823 [0] MCCL INFO New proxy recv connection 12 from local rank 0, transport 2 localhost:3808864:3809771 [7] MCCL INFO Connection to proxy localRank 7 -> connection 0x7f1378001178 localhost:3808864:3809771 [7] MCCL INFO Channel 04/0 : 7[5e000] -> 8[5000] [send] via NET/IB/0 comm 0x7f14a6340010 nRanks 16 localhost:3808864:3809707 [7] MCCL INFO New proxy send connection 6 from local rank 7, transport 2 localhost:1542597:1542901 [7] MCCL INFO Connection to proxy localRank 7 -> connection 0x7f74a0001208 localhost:1542597:1542901 [7] MCCL INFO Channel 06/0 : 15[5e000] -> 0[5000] [send] via NET/IB/0 comm 0x7f75cdd40010 nRanks 16 localhost:1542597:1542822 [7] MCCL INFO New proxy send connection 8 from local rank 7, transport 2 localhost:3808851:3809776 [0] MCCL INFO Connection to proxy localRank 0 -> connection 0x7feda8001370 localhost:3808851:3809710 [0] MCCL INFO ip=192.168.100.13, port=0 localhost:3808851:3809776 [0] MCCL INFO Channel 11/0 : 15[5e000] -> 0[5000] [receive] via NET/IB/0 comm 0x7feed6140010 nRanks 16 localhost:3808851:3809710 [0] MCCL INFO New proxy recv connection 13 from local rank 0, transport 2 localhost:1542590:1542898 [0] MCCL INFO Connection to proxy localRank 0 -> connection 0x7fcda8001370 localhost:1542590:1542823 [0] MCCL INFO ip=192.168.100.14, port=0 localhost:1542590:1542898 [0] MCCL INFO Channel 11/0 : 7[5e000] -> 8[5000] [receive] via NET/IB/0 comm 0x7fced0b40010 nRanks 16 localhost:1542590:1542823 [0] MCCL INFO New proxy recv connection 13 from local rank 0, transport 2 localhost:3808864:3809771 [7] MCCL INFO Connection to proxy localRank 7 -> connection 0x7f13780011c0 localhost:3808864:3809771 [7] MCCL INFO Channel 05/0 : 
7[5e000] -> 8[5000] [send] via NET/IB/0 comm 0x7f14a6340010 nRanks 16 localhost:3808864:3809707 [7] MCCL INFO New proxy send connection 7 from local rank 7, transport 2 localhost:1542597:1542901 [7] MCCL INFO Connection to proxy localRank 7 -> connection 0x7f74a0001250 localhost:1542597:1542901 [7] MCCL INFO Channel 07/0 : 15[5e000] -> 0[5000] [send] via NET/IB/0 comm 0x7f75cdd40010 nRanks 16 localhost:1542597:1542822 [7] MCCL INFO New proxy send connection 9 from local rank 7, transport 2 localhost:3808851:3809776 [0] MCCL INFO Connection to proxy localRank 0 -> connection 0x7feda80013b8 localhost:3808851:3809710 [0] MCCL INFO ip=192.168.100.13, port=0 localhost:3808851:3809776 [0] MCCL INFO Channel 12/0 : 15[5e000] -> 0[5000] [receive] via NET/IB/0 comm 0x7feed6140010 nRanks 16 localhost:3808851:3809710 [0] MCCL INFO New proxy recv connection 14 from local rank 0, transport 2 localhost:1542590:1542898 [0] MCCL INFO Connection to proxy localRank 0 -> connection 0x7fcda80013b8 localhost:1542590:1542823 [0] MCCL INFO ip=192.168.100.14, port=0 localhost:1542590:1542898 [0] MCCL INFO Channel 12/0 : 7[5e000] -> 8[5000] [receive] via NET/IB/0 comm 0x7fced0b40010 nRanks 16 localhost:1542590:1542823 [0] MCCL INFO New proxy recv connection 14 from local rank 0, transport 2 localhost:3808864:3809771 [7] MCCL INFO Connection to proxy localRank 7 -> connection 0x7f1378001208 localhost:3808864:3809771 [7] MCCL INFO Channel 06/0 : 7[5e000] -> 8[5000] [send] via NET/IB/0 comm 0x7f14a6340010 nRanks 16 localhost:3808864:3809707 [7] MCCL INFO New proxy send connection 8 from local rank 7, transport 2 localhost:1542597:1542901 [7] MCCL INFO Connection to proxy localRank 7 -> connection 0x7f74a0001298 localhost:1542597:1542901 [7] MCCL INFO Channel 08/0 : 15[5e000] -> 0[5000] [send] via NET/IB/0 comm 0x7f75cdd40010 nRanks 16 localhost:1542597:1542822 [7] MCCL INFO New proxy send connection 10 from local rank 7, transport 2 localhost:3808851:3809776 [0] MCCL INFO Connection to proxy 
localRank 0 -> connection 0x7feda8001400 localhost:3808851:3809710 [0] MCCL INFO ip=192.168.100.13, port=0 localhost:3808851:3809776 [0] MCCL INFO Channel 13/0 : 15[5e000] -> 0[5000] [receive] via NET/IB/0 comm 0x7feed6140010 nRanks 16 localhost:3808851:3809710 [0] MCCL INFO New proxy recv connection 15 from local rank 0, transport 2 localhost:1542590:1542898 [0] MCCL INFO Connection to proxy localRank 0 -> connection 0x7fcda8001400 localhost:1542590:1542823 [0] MCCL INFO ip=192.168.100.14, port=0 localhost:1542590:1542898 [0] MCCL INFO Channel 13/0 : 7[5e000] -> 8[5000] [receive] via NET/IB/0 comm 0x7fced0b40010 nRanks 16 localhost:1542590:1542823 [0] MCCL INFO New proxy recv connection 15 from local rank 0, transport 2 localhost:3808864:3809771 [7] MCCL INFO Connection to proxy localRank 7 -> connection 0x7f1378001250 localhost:3808864:3809771 [7] MCCL INFO Channel 07/0 : 7[5e000] -> 8[5000] [send] via NET/IB/0 comm 0x7f14a6340010 nRanks 16 localhost:3808864:3809707 [7] MCCL INFO New proxy send connection 9 from local rank 7, transport 2 localhost:3808851:3809776 [0] MCCL INFO Connection to proxy localRank 0 -> connection 0x7feda8001448 localhost:1542597:1542901 [7] MCCL INFO Connection to proxy localRank 7 -> connection 0x7f74a00012e0 localhost:1542597:1542901 [7] MCCL INFO Channel 09/0 : 15[5e000] -> 0[5000] [send] via NET/IB/0 comm 0x7f75cdd40010 nRanks 16 localhost:1542597:1542822 [7] MCCL INFO New proxy send connection 11 from local rank 7, transport 2 localhost:1542590:1542898 [0] MCCL INFO Connection to proxy localRank 0 -> connection 0x7fcda8001448 localhost:1542590:1542823 [0] MCCL INFO ip=192.168.100.14, port=0 localhost:1542590:1542898 [0] MCCL INFO Channel 14/0 : 7[5e000] -> 8[5000] [receive] via NET/IB/0 comm 0x7fced0b40010 nRanks 16 localhost:1542590:1542823 [0] MCCL INFO New proxy recv connection 16 from local rank 0, transport 2 localhost:3808864:3809771 [7] MCCL INFO Connection to proxy localRank 7 -> connection 0x7f1378001298 
localhost:3808864:3809771 [7] MCCL INFO Channel 08/0 : 7[5e000] -> 8[5000] [send] via NET/IB/0 comm 0x7f14a6340010 nRanks 16 localhost:3808864:3809707 [7] MCCL INFO New proxy send connection 10 from local rank 7, transport 2 localhost:1542597:1542901 [7] MCCL INFO Connection to proxy localRank 7 -> connection 0x7f74a0001328 localhost:1542597:1542901 [7] MCCL INFO Channel 10/0 : 15[5e000] -> 0[5000] [send] via NET/IB/0 comm 0x7f75cdd40010 nRanks 16 localhost:1542597:1542822 [7] MCCL INFO New proxy send connection 12 from local rank 7, transport 2 localhost:1542590:1542898 [0] MCCL INFO Connection to proxy localRank 0 -> connection 0x7fcda8001490 localhost:3808864:3809771 [7] MCCL INFO Connection to proxy localRank 7 -> connection 0x7f13780012e0 localhost:3808864:3809771 [7] MCCL INFO Channel 09/0 : 7[5e000] -> 8[5000] [send] via NET/IB/0 comm 0x7f14a6340010 nRanks 16 localhost:3808864:3809707 [7] MCCL INFO New proxy send connection 11 from local rank 7, transport 2 localhost:1542597:1542901 [7] MCCL INFO Connection to proxy localRank 7 -> connection 0x7f74a0001370 localhost:1542597:1542901 [7] MCCL INFO Channel 11/0 : 15[5e000] -> 0[5000] [send] via NET/IB/0 comm 0x7f75cdd40010 nRanks 16 localhost:1542597:1542822 [7] MCCL INFO New proxy send connection 13 from local rank 7, transport 2 localhost:3808851:3809710 [0] MCCL INFO ip=192.168.100.13, port=0 localhost:3808851:3809776 [0] MCCL INFO Channel 14/0 : 15[5e000] -> 0[5000] [receive] via NET/IB/0 comm 0x7feed6140010 nRanks 16 localhost:3808851:3809710 [0] MCCL INFO New proxy recv connection 16 from local rank 0, transport 2 localhost:3808864:3809771 [7] MCCL INFO Connection to proxy localRank 7 -> connection 0x7f1378001328 localhost:3808864:3809771 [7] MCCL INFO Channel 10/0 : 7[5e000] -> 8[5000] [send] via NET/IB/0 comm 0x7f14a6340010 nRanks 16 localhost:3808864:3809707 [7] MCCL INFO New proxy send connection 12 from local rank 7, transport 2 localhost:1542597:1542901 [7] MCCL INFO Connection to proxy localRank 7 
-> connection 0x7f74a00013b8 localhost:1542597:1542901 [7] MCCL INFO Channel 12/0 : 15[5e000] -> 0[5000] [send] via NET/IB/0 comm 0x7f75cdd40010 nRanks 16 localhost:1542597:1542822 [7] MCCL INFO New proxy send connection 14 from local rank 7, transport 2 localhost:3808851:3809776 [0] MCCL INFO Connection to proxy localRank 0 -> connection 0x7feda8001490 localhost:3808851:3809710 [0] MCCL INFO ip=192.168.100.13, port=0 localhost:3808851:3809776 [0] MCCL INFO Channel 15/0 : 15[5e000] -> 0[5000] [receive] via NET/IB/0 comm 0x7feed6140010 nRanks 16 localhost:3808851:3809710 [0] MCCL INFO New proxy recv connection 17 from local rank 0, transport 2 localhost:1542590:1542823 [0] MCCL INFO ip=192.168.100.14, port=0 localhost:1542590:1542898 [0] MCCL INFO Channel 15/0 : 7[5e000] -> 8[5000] [receive] via NET/IB/0 comm 0x7fced0b40010 nRanks 16 localhost:1542590:1542823 [0] MCCL INFO New proxy recv connection 17 from local rank 0, transport 2 localhost:3808864:3809771 [7] MCCL INFO Connection to proxy localRank 7 -> connection 0x7f1378001370 localhost:3808864:3809771 [7] MCCL INFO Channel 11/0 : 7[5e000] -> 8[5000] [send] via NET/IB/0 comm 0x7f14a6340010 nRanks 16 localhost:3808864:3809707 [7] MCCL INFO New proxy send connection 13 from local rank 7, transport 2 localhost:1542597:1542901 [7] MCCL INFO Connection to proxy localRank 7 -> connection 0x7f74a0001400 localhost:1542597:1542901 [7] MCCL INFO Channel 13/0 : 15[5e000] -> 0[5000] [send] via NET/IB/0 comm 0x7f75cdd40010 nRanks 16 localhost:1542597:1542822 [7] MCCL INFO New proxy send connection 15 from local rank 7, transport 2 localhost:3808851:3809776 [0] MCCL INFO Connection to proxy localRank 0 -> connection 0x7feda80014d8 localhost:3808851:3809710 [0] MCCL INFO ip=192.168.100.13, port=0 localhost:3808851:3809776 [0] MCCL INFO Channel 16/0 : 15[5e000] -> 0[5000] [receive] via NET/IB/0 comm 0x7feed6140010 nRanks 16 localhost:3808851:3809710 [0] MCCL INFO New proxy recv connection 18 from local rank 0, transport 2 
localhost:1542590:1542898 [0] MCCL INFO Connection to proxy localRank 0 -> connection 0x7fcda80014d8 localhost:1542590:1542823 [0] MCCL INFO ip=192.168.100.14, port=0 localhost:1542590:1542898 [0] MCCL INFO Channel 16/0 : 7[5e000] -> 8[5000] [receive] via NET/IB/0 comm 0x7fced0b40010 nRanks 16 localhost:1542590:1542823 [0] MCCL INFO New proxy recv connection 18 from local rank 0, transport 2 localhost:3808864:3809771 [7] MCCL INFO Connection to proxy localRank 7 -> connection 0x7f13780013b8 localhost:3808864:3809771 [7] MCCL INFO Channel 12/0 : 7[5e000] -> 8[5000] [send] via NET/IB/0 comm 0x7f14a6340010 nRanks 16 localhost:3808864:3809707 [7] MCCL INFO New proxy send connection 14 from local rank 7, transport 2 localhost:1542597:1542901 [7] MCCL INFO Connection to proxy localRank 7 -> connection 0x7f74a0001448 localhost:1542597:1542901 [7] MCCL INFO Channel 14/0 : 15[5e000] -> 0[5000] [send] via NET/IB/0 comm 0x7f75cdd40010 nRanks 16 localhost:1542597:1542822 [7] MCCL INFO New proxy send connection 16 from local rank 7, transport 2 localhost:3808851:3809776 [0] MCCL INFO Connection to proxy localRank 0 -> connection 0x7feda8001520 localhost:3808851:3809710 [0] MCCL INFO ip=192.168.100.13, port=0 localhost:3808851:3809776 [0] MCCL INFO Channel 17/0 : 15[5e000] -> 0[5000] [receive] via NET/IB/0 comm 0x7feed6140010 nRanks 16 localhost:3808851:3809710 [0] MCCL INFO New proxy recv connection 19 from local rank 0, transport 2 localhost:1542590:1542898 [0] MCCL INFO Connection to proxy localRank 0 -> connection 0x7fcda8001520 localhost:1542590:1542823 [0] MCCL INFO ip=192.168.100.14, port=0 localhost:1542590:1542898 [0] MCCL INFO Channel 17/0 : 7[5e000] -> 8[5000] [receive] via NET/IB/0 comm 0x7fced0b40010 nRanks 16 localhost:1542590:1542823 [0] MCCL INFO New proxy recv connection 19 from local rank 0, transport 2 localhost:3808864:3809771 [7] MCCL INFO Connection to proxy localRank 7 -> connection 0x7f1378001400 localhost:3808864:3809771 [7] MCCL INFO Channel 13/0 : 
7[5e000] -> 8[5000] [send] via NET/IB/0 comm 0x7f14a6340010 nRanks 16 localhost:3808864:3809707 [7] MCCL INFO New proxy send connection 15 from local rank 7, transport 2 localhost:1542597:1542901 [7] MCCL INFO Connection to proxy localRank 7 -> connection 0x7f74a0001490 localhost:1542597:1542901 [7] MCCL INFO Channel 15/0 : 15[5e000] -> 0[5000] [send] via NET/IB/0 comm 0x7f75cdd40010 nRanks 16 localhost:1542597:1542822 [7] MCCL INFO New proxy send connection 17 from local rank 7, transport 2 localhost:3808851:3809776 [0] MCCL INFO Connection to proxy localRank 0 -> connection 0x7feda8001568 localhost:3808851:3809710 [0] MCCL INFO ip=192.168.100.13, port=0 localhost:3808851:3809776 [0] MCCL INFO Channel 18/0 : 15[5e000] -> 0[5000] [receive] via NET/IB/0 comm 0x7feed6140010 nRanks 16 localhost:3808851:3809710 [0] MCCL INFO New proxy recv connection 20 from local rank 0, transport 2 localhost:1542590:1542898 [0] MCCL INFO Connection to proxy localRank 0 -> connection 0x7fcda8001568 localhost:1542590:1542823 [0] MCCL INFO ip=192.168.100.14, port=0 localhost:1542590:1542898 [0] MCCL INFO Channel 18/0 : 7[5e000] -> 8[5000] [receive] via NET/IB/0 comm 0x7fced0b40010 nRanks 16 localhost:1542590:1542823 [0] MCCL INFO New proxy recv connection 20 from local rank 0, transport 2 localhost:3808864:3809771 [7] MCCL INFO Connection to proxy localRank 7 -> connection 0x7f1378001448 localhost:3808864:3809771 [7] MCCL INFO Channel 14/0 : 7[5e000] -> 8[5000] [send] via NET/IB/0 comm 0x7f14a6340010 nRanks 16 localhost:3808864:3809707 [7] MCCL INFO New proxy send connection 16 from local rank 7, transport 2 localhost:1542597:1542901 [7] MCCL INFO Connection to proxy localRank 7 -> connection 0x7f74a00014d8 localhost:1542597:1542901 [7] MCCL INFO Channel 16/0 : 15[5e000] -> 0[5000] [send] via NET/IB/0 comm 0x7f75cdd40010 nRanks 16 localhost:1542597:1542822 [7] MCCL INFO New proxy send connection 18 from local rank 7, transport 2 localhost:3808851:3809776 [0] MCCL INFO Connection to 
proxy localRank 0 -> connection 0x7feda80015b0 localhost:3808851:3809710 [0] MCCL INFO ip=192.168.100.13, port=0 localhost:3808851:3809776 [0] MCCL INFO Channel 19/0 : 15[5e000] -> 0[5000] [receive] via NET/IB/0 comm 0x7feed6140010 nRanks 16 localhost:3808851:3809710 [0] MCCL INFO New proxy recv connection 21 from local rank 0, transport 2 localhost:1542590:1542898 [0] MCCL INFO Connection to proxy localRank 0 -> connection 0x7fcda80015b0 localhost:1542590:1542823 [0] MCCL INFO ip=192.168.100.14, port=0 localhost:1542590:1542898 [0] MCCL INFO Channel 19/0 : 7[5e000] -> 8[5000] [receive] via NET/IB/0 comm 0x7fced0b40010 nRanks 16 localhost:1542590:1542823 [0] MCCL INFO New proxy recv connection 21 from local rank 0, transport 2 localhost:3808864:3809771 [7] MCCL INFO Connection to proxy localRank 7 -> connection 0x7f1378001490 localhost:3808864:3809771 [7] MCCL INFO Channel 15/0 : 7[5e000] -> 8[5000] [send] via NET/IB/0 comm 0x7f14a6340010 nRanks 16 localhost:3808864:3809707 [7] MCCL INFO New proxy send connection 17 from local rank 7, transport 2 localhost:1542597:1542901 [7] MCCL INFO Connection to proxy localRank 7 -> connection 0x7f74a0001520 localhost:1542597:1542901 [7] MCCL INFO Channel 17/0 : 15[5e000] -> 0[5000] [send] via NET/IB/0 comm 0x7f75cdd40010 nRanks 16 localhost:1542597:1542822 [7] MCCL INFO New proxy send connection 19 from local rank 7, transport 2 localhost:3808851:3809776 [0] MCCL INFO Connection to proxy localRank 0 -> connection 0x7feda80015f8 localhost:3808851:3809710 [0] MCCL INFO ip=192.168.100.13, port=0 localhost:3808851:3809776 [0] MCCL INFO Channel 20/0 : 15[5e000] -> 0[5000] [receive] via NET/IB/0 comm 0x7feed6140010 nRanks 16 localhost:3808851:3809710 [0] MCCL INFO New proxy recv connection 22 from local rank 0, transport 2 localhost:1542590:1542898 [0] MCCL INFO Connection to proxy localRank 0 -> connection 0x7fcda80015f8 localhost:1542590:1542823 [0] MCCL INFO ip=192.168.100.14, port=0 localhost:1542590:1542898 [0] MCCL INFO 
Channel 20/0 : 7[5e000] -> 8[5000] [receive] via NET/IB/0 comm 0x7fced0b40010 nRanks 16 localhost:1542590:1542823 [0] MCCL INFO New proxy recv connection 22 from local rank 0, transport 2 localhost:3808864:3809771 [7] MCCL INFO Connection to proxy localRank 7 -> connection 0x7f13780014d8 localhost:3808864:3809771 [7] MCCL INFO Channel 16/0 : 7[5e000] -> 8[5000] [send] via NET/IB/0 comm 0x7f14a6340010 nRanks 16 localhost:3808864:3809707 [7] MCCL INFO New proxy send connection 18 from local rank 7, transport 2 localhost:1542597:1542901 [7] MCCL INFO Connection to proxy localRank 7 -> connection 0x7f74a0001568 localhost:1542597:1542901 [7] MCCL INFO Channel 18/0 : 15[5e000] -> 0[5000] [send] via NET/IB/0 comm 0x7f75cdd40010 nRanks 16 localhost:1542597:1542822 [7] MCCL INFO New proxy send connection 20 from local rank 7, transport 2 localhost:3808851:3809776 [0] MCCL INFO Connection to proxy localRank 0 -> connection 0x7feda8001640 localhost:3808851:3809710 [0] MCCL INFO ip=192.168.100.13, port=0 localhost:3808851:3809776 [0] MCCL INFO Channel 21/0 : 15[5e000] -> 0[5000] [receive] via NET/IB/0 comm 0x7feed6140010 nRanks 16 localhost:3808851:3809710 [0] MCCL INFO New proxy recv connection 23 from local rank 0, transport 2 localhost:1542590:1542898 [0] MCCL INFO Connection to proxy localRank 0 -> connection 0x7fcda8001640 localhost:1542590:1542823 [0] MCCL INFO ip=192.168.100.14, port=0 localhost:1542590:1542898 [0] MCCL INFO Channel 21/0 : 7[5e000] -> 8[5000] [receive] via NET/IB/0 comm 0x7fced0b40010 nRanks 16 localhost:1542590:1542823 [0] MCCL INFO New proxy recv connection 23 from local rank 0, transport 2 localhost:3808864:3809771 [7] MCCL INFO Connection to proxy localRank 7 -> connection 0x7f1378001520 localhost:3808864:3809771 [7] MCCL INFO Channel 17/0 : 7[5e000] -> 8[5000] [send] via NET/IB/0 comm 0x7f14a6340010 nRanks 16 localhost:3808864:3809707 [7] MCCL INFO New proxy send connection 19 from local rank 7, transport 2 localhost:1542597:1542901 [7] MCCL INFO 
Connection to proxy localRank 7 -> connection 0x7f74a00015b0 localhost:1542597:1542901 [7] MCCL INFO Channel 19/0 : 15[5e000] -> 0[5000] [send] via NET/IB/0 comm 0x7f75cdd40010 nRanks 16 localhost:1542597:1542822 [7] MCCL INFO New proxy send connection 21 from local rank 7, transport 2 localhost:3808851:3809776 [0] MCCL INFO Connection to proxy localRank 0 -> connection 0x7feda8001688 localhost:3808851:3809710 [0] MCCL INFO ip=192.168.100.13, port=0 localhost:3808851:3809776 [0] MCCL INFO Channel 22/0 : 15[5e000] -> 0[5000] [receive] via NET/IB/0 comm 0x7feed6140010 nRanks 16 localhost:3808851:3809710 [0] MCCL INFO New proxy recv connection 24 from local rank 0, transport 2 localhost:1542590:1542898 [0] MCCL INFO Connection to proxy localRank 0 -> connection 0x7fcda8001688 localhost:1542590:1542823 [0] MCCL INFO ip=192.168.100.14, port=0 localhost:1542590:1542898 [0] MCCL INFO Channel 22/0 : 7[5e000] -> 8[5000] [receive] via NET/IB/0 comm 0x7fced0b40010 nRanks 16 localhost:1542590:1542823 [0] MCCL INFO New proxy recv connection 24 from local rank 0, transport 2 localhost:3808864:3809771 [7] MCCL INFO Connection to proxy localRank 7 -> connection 0x7f1378001568 localhost:3808864:3809771 [7] MCCL INFO Channel 18/0 : 7[5e000] -> 8[5000] [send] via NET/IB/0 comm 0x7f14a6340010 nRanks 16 localhost:3808864:3809707 [7] MCCL INFO New proxy send connection 20 from local rank 7, transport 2 localhost:1542597:1542901 [7] MCCL INFO Connection to proxy localRank 7 -> connection 0x7f74a00015f8 localhost:1542597:1542901 [7] MCCL INFO Channel 20/0 : 15[5e000] -> 0[5000] [send] via NET/IB/0 comm 0x7f75cdd40010 nRanks 16 localhost:1542597:1542822 [7] MCCL INFO New proxy send connection 22 from local rank 7, transport 2 localhost:3808851:3809776 [0] MCCL INFO Connection to proxy localRank 0 -> connection 0x7feda80016d0 localhost:3808851:3809710 [0] MCCL INFO ip=192.168.100.13, port=0 localhost:3808851:3809776 [0] MCCL INFO Channel 23/0 : 15[5e000] -> 0[5000] [receive] via NET/IB/0 
comm 0x7feed6140010 nRanks 16 localhost:3808851:3809710 [0] MCCL INFO New proxy recv connection 25 from local rank 0, transport 2 localhost:1542590:1542898 [0] MCCL INFO Connection to proxy localRank 0 -> connection 0x7fcda80016d0 localhost:1542590:1542823 [0] MCCL INFO ip=192.168.100.14, port=0 localhost:1542590:1542898 [0] MCCL INFO Channel 23/0 : 7[5e000] -> 8[5000] [receive] via NET/IB/0 comm 0x7fced0b40010 nRanks 16 localhost:1542590:1542823 [0] MCCL INFO New proxy recv connection 25 from local rank 0, transport 2 localhost:3808864:3809771 [7] MCCL INFO Connection to proxy localRank 7 -> connection 0x7f13780015b0 localhost:3808864:3809771 [7] MCCL INFO Channel 19/0 : 7[5e000] -> 8[5000] [send] via NET/IB/0 comm 0x7f14a6340010 nRanks 16 localhost:3808864:3809707 [7] MCCL INFO New proxy send connection 21 from local rank 7, transport 2 localhost:1542597:1542901 [7] MCCL INFO Connection to proxy localRank 7 -> connection 0x7f74a0001640 localhost:1542597:1542901 [7] MCCL INFO Channel 21/0 : 15[5e000] -> 0[5000] [send] via NET/IB/0 comm 0x7f75cdd40010 nRanks 16 localhost:1542597:1542822 [7] MCCL INFO New proxy send connection 23 from local rank 7, transport 2 localhost:3808851:3809776 [0] MCCL INFO Connection to proxy localRank 0 -> connection 0x7feda8001718 localhost:3808851:3809710 [0] MCCL INFO ip=192.168.100.13, port=0 localhost:3808851:3809776 [0] MCCL INFO Channel 24/0 : 15[5e000] -> 0[5000] [receive] via NET/IB/0 comm 0x7feed6140010 nRanks 16 localhost:3808851:3809710 [0] MCCL INFO New proxy recv connection 26 from local rank 0, transport 2 localhost:1542590:1542898 [0] MCCL INFO Connection to proxy localRank 0 -> connection 0x7fcda8001718 localhost:1542590:1542823 [0] MCCL INFO ip=192.168.100.14, port=0 localhost:1542590:1542898 [0] MCCL INFO Channel 24/0 : 7[5e000] -> 8[5000] [receive] via NET/IB/0 comm 0x7fced0b40010 nRanks 16 localhost:1542590:1542823 [0] MCCL INFO New proxy recv connection 26 from local rank 0, transport 2 localhost:3808864:3809771 [7] 
MCCL INFO Connection to proxy localRank 7 -> connection 0x7f13780015f8 localhost:3808864:3809771 [7] MCCL INFO Channel 20/0 : 7[5e000] -> 8[5000] [send] via NET/IB/0 comm 0x7f14a6340010 nRanks 16 localhost:3808864:3809707 [7] MCCL INFO New proxy send connection 22 from local rank 7, transport 2 localhost:1542597:1542901 [7] MCCL INFO Connection to proxy localRank 7 -> connection 0x7f74a0001688 localhost:1542597:1542901 [7] MCCL INFO Channel 22/0 : 15[5e000] -> 0[5000] [send] via NET/IB/0 comm 0x7f75cdd40010 nRanks 16 localhost:1542597:1542822 [7] MCCL INFO New proxy send connection 24 from local rank 7, transport 2 localhost:3808851:3809776 [0] MCCL INFO Connection to proxy localRank 0 -> connection 0x7feda8001760 localhost:3808851:3809710 [0] MCCL INFO ip=192.168.100.13, port=0 localhost:3808851:3809776 [0] MCCL INFO Channel 25/0 : 15[5e000] -> 0[5000] [receive] via NET/IB/0 comm 0x7feed6140010 nRanks 16 localhost:3808851:3809710 [0] MCCL INFO New proxy recv connection 27 from local rank 0, transport 2 localhost:1542590:1542898 [0] MCCL INFO Connection to proxy localRank 0 -> connection 0x7fcda8001760 localhost:1542590:1542823 [0] MCCL INFO ip=192.168.100.14, port=0 localhost:1542590:1542898 [0] MCCL INFO Channel 25/0 : 7[5e000] -> 8[5000] [receive] via NET/IB/0 comm 0x7fced0b40010 nRanks 16 localhost:1542590:1542823 [0] MCCL INFO New proxy recv connection 27 from local rank 0, transport 2 localhost:3808864:3809771 [7] MCCL INFO Connection to proxy localRank 7 -> connection 0x7f1378001640 localhost:3808864:3809771 [7] MCCL INFO Channel 21/0 : 7[5e000] -> 8[5000] [send] via NET/IB/0 comm 0x7f14a6340010 nRanks 16 localhost:3808864:3809707 [7] MCCL INFO New proxy send connection 23 from local rank 7, transport 2 localhost:1542597:1542901 [7] MCCL INFO Connection to proxy localRank 7 -> connection 0x7f74a00016d0 localhost:1542597:1542901 [7] MCCL INFO Channel 23/0 : 15[5e000] -> 0[5000] [send] via NET/IB/0 comm 0x7f75cdd40010 nRanks 16 localhost:1542597:1542822 [7] 
MCCL INFO New proxy send connection 25 from local rank 7, transport 2 localhost:3808851:3809776 [0] MCCL INFO Connection to proxy localRank 0 -> connection 0x7feda80017a8 localhost:3808851:3809710 [0] MCCL INFO ip=192.168.100.13, port=0 localhost:3808851:3809776 [0] MCCL INFO Channel 26/0 : 15[5e000] -> 0[5000] [receive] via NET/IB/0 comm 0x7feed6140010 nRanks 16 localhost:3808851:3809710 [0] MCCL INFO New proxy recv connection 28 from local rank 0, transport 2 localhost:1542590:1542898 [0] MCCL INFO Connection to proxy localRank 0 -> connection 0x7fcda80017a8 localhost:1542590:1542823 [0] MCCL INFO ip=192.168.100.14, port=0 localhost:1542590:1542898 [0] MCCL INFO Channel 26/0 : 7[5e000] -> 8[5000] [receive] via NET/IB/0 comm 0x7fced0b40010 nRanks 16 localhost:1542590:1542823 [0] MCCL INFO New proxy recv connection 28 from local rank 0, transport 2 localhost:3808864:3809771 [7] MCCL INFO Connection to proxy localRank 7 -> connection 0x7f1378001688 localhost:3808864:3809771 [7] MCCL INFO Channel 22/0 : 7[5e000] -> 8[5000] [send] via NET/IB/0 comm 0x7f14a6340010 nRanks 16 localhost:3808864:3809707 [7] MCCL INFO New proxy send connection 24 from local rank 7, transport 2 localhost:1542597:1542901 [7] MCCL INFO Connection to proxy localRank 7 -> connection 0x7f74a0001718 localhost:1542597:1542901 [7] MCCL INFO Channel 24/0 : 15[5e000] -> 0[5000] [send] via NET/IB/0 comm 0x7f75cdd40010 nRanks 16 localhost:1542597:1542822 [7] MCCL INFO New proxy send connection 26 from local rank 7, transport 2 localhost:3808851:3809776 [0] MCCL INFO Connection to proxy localRank 0 -> connection 0x7feda80017f0 localhost:3808851:3809710 [0] MCCL INFO ip=192.168.100.13, port=0 localhost:3808851:3809776 [0] MCCL INFO Channel 27/0 : 15[5e000] -> 0[5000] [receive] via NET/IB/0 comm 0x7feed6140010 nRanks 16 localhost:3808851:3809710 [0] MCCL INFO New proxy recv connection 29 from local rank 0, transport 2 localhost:1542590:1542898 [0] MCCL INFO Connection to proxy localRank 0 -> connection 
0x7fcda80017f0 localhost:1542590:1542823 [0] MCCL INFO ip=192.168.100.14, port=0 localhost:1542590:1542898 [0] MCCL INFO Channel 27/0 : 7[5e000] -> 8[5000] [receive] via NET/IB/0 comm 0x7fced0b40010 nRanks 16 localhost:1542590:1542823 [0] MCCL INFO New proxy recv connection 29 from local rank 0, transport 2 localhost:3808864:3809771 [7] MCCL INFO Connection to proxy localRank 7 -> connection 0x7f13780016d0 localhost:3808864:3809771 [7] MCCL INFO Channel 23/0 : 7[5e000] -> 8[5000] [send] via NET/IB/0 comm 0x7f14a6340010 nRanks 16 localhost:3808864:3809707 [7] MCCL INFO New proxy send connection 25 from local rank 7, transport 2 localhost:1542597:1542901 [7] MCCL INFO Connection to proxy localRank 7 -> connection 0x7f74a0001760 localhost:1542597:1542901 [7] MCCL INFO Channel 25/0 : 15[5e000] -> 0[5000] [send] via NET/IB/0 comm 0x7f75cdd40010 nRanks 16 localhost:1542597:1542822 [7] MCCL INFO New proxy send connection 27 from local rank 7, transport 2 localhost:3808851:3809776 [0] MCCL INFO Connection to proxy localRank 0 -> connection 0x7feda8001838 localhost:3808851:3809710 [0] MCCL INFO ip=192.168.100.13, port=0 localhost:3808851:3809776 [0] MCCL INFO Channel 28/0 : 15[5e000] -> 0[5000] [receive] via NET/IB/0 comm 0x7feed6140010 nRanks 16 localhost:3808851:3809710 [0] MCCL INFO New proxy recv connection 30 from local rank 0, transport 2 localhost:3808864:3809771 [7] MCCL INFO Connection to proxy localRank 7 -> connection 0x7f1378001718 localhost:3808864:3809771 [7] MCCL INFO Channel 24/0 : 7[5e000] -> 8[5000] [send] via NET/IB/0 comm 0x7f14a6340010 nRanks 16 localhost:3808864:3809707 [7] MCCL INFO New proxy send connection 26 from local rank 7, transport 2 localhost:1542590:1542898 [0] MCCL INFO Connection to proxy localRank 0 -> connection 0x7fcda8001838 localhost:1542590:1542823 [0] MCCL INFO ip=192.168.100.14, port=0 localhost:1542590:1542898 [0] MCCL INFO Channel 28/0 : 7[5e000] -> 8[5000] [receive] via NET/IB/0 comm 0x7fced0b40010 nRanks 16 
localhost:1542590:1542823 [0] MCCL INFO New proxy recv connection 30 from local rank 0, transport 2 localhost:1542597:1542901 [7] MCCL INFO Connection to proxy localRank 7 -> connection 0x7f74a00017a8 localhost:1542597:1542901 [7] MCCL INFO Channel 26/0 : 15[5e000] -> 0[5000] [send] via NET/IB/0 comm 0x7f75cdd40010 nRanks 16 localhost:1542597:1542822 [7] MCCL INFO New proxy send connection 28 from local rank 7, transport 2 localhost:3808851:3809776 [0] MCCL INFO Connection to proxy localRank 0 -> connection 0x7feda8001880 localhost:3808851:3809710 [0] MCCL INFO ip=192.168.100.13, port=0 localhost:3808851:3809776 [0] MCCL INFO Channel 29/0 : 15[5e000] -> 0[5000] [receive] via NET/IB/0 comm 0x7feed6140010 nRanks 16 localhost:3808851:3809710 [0] MCCL INFO New proxy recv connection 31 from local rank 0, transport 2 localhost:3808864:3809771 [7] MCCL INFO Connection to proxy localRank 7 -> connection 0x7f1378001760 localhost:3808864:3809771 [7] MCCL INFO Channel 25/0 : 7[5e000] -> 8[5000] [send] via NET/IB/0 comm 0x7f14a6340010 nRanks 16 localhost:3808864:3809707 [7] MCCL INFO New proxy send connection 27 from local rank 7, transport 2 localhost:1542590:1542898 [0] MCCL INFO Connection to proxy localRank 0 -> connection 0x7fcda8001880 localhost:1542590:1542823 [0] MCCL INFO ip=192.168.100.14, port=0 localhost:1542590:1542898 [0] MCCL INFO Channel 29/0 : 7[5e000] -> 8[5000] [receive] via NET/IB/0 comm 0x7fced0b40010 nRanks 16 localhost:1542590:1542823 [0] MCCL INFO New proxy recv connection 31 from local rank 0, transport 2 localhost:1542597:1542901 [7] MCCL INFO Connection to proxy localRank 7 -> connection 0x7f74a00017f0 localhost:1542597:1542901 [7] MCCL INFO Channel 27/0 : 15[5e000] -> 0[5000] [send] via NET/IB/0 comm 0x7f75cdd40010 nRanks 16 localhost:1542597:1542822 [7] MCCL INFO New proxy send connection 29 from local rank 7, transport 2 localhost:3808851:3809776 [0] MCCL INFO Connection to proxy localRank 0 -> connection 0x7feda80018c8 localhost:3808851:3809710 
[0] MCCL INFO ip=192.168.100.13, port=0 localhost:3808851:3809776 [0] MCCL INFO Channel 30/0 : 15[5e000] -> 0[5000] [receive] via NET/IB/0 comm 0x7feed6140010 nRanks 16 localhost:3808851:3809710 [0] MCCL INFO New proxy recv connection 32 from local rank 0, transport 2 localhost:3808864:3809771 [7] MCCL INFO Connection to proxy localRank 7 -> connection 0x7f13780017a8 localhost:3808864:3809771 [7] MCCL INFO Channel 26/0 : 7[5e000] -> 8[5000] [send] via NET/IB/0 comm 0x7f14a6340010 nRanks 16 localhost:3808864:3809707 [7] MCCL INFO New proxy send connection 28 from local rank 7, transport 2 localhost:1542590:1542898 [0] MCCL INFO Connection to proxy localRank 0 -> connection 0x7fcda80018c8 localhost:1542590:1542823 [0] MCCL INFO ip=192.168.100.14, port=0 localhost:1542590:1542898 [0] MCCL INFO Channel 30/0 : 7[5e000] -> 8[5000] [receive] via NET/IB/0 comm 0x7fced0b40010 nRanks 16 localhost:1542590:1542823 [0] MCCL INFO New proxy recv connection 32 from local rank 0, transport 2 localhost:1542597:1542901 [7] MCCL INFO Connection to proxy localRank 7 -> connection 0x7f74a0001838 localhost:1542597:1542901 [7] MCCL INFO Channel 28/0 : 15[5e000] -> 0[5000] [send] via NET/IB/0 comm 0x7f75cdd40010 nRanks 16 localhost:1542597:1542822 [7] MCCL INFO New proxy send connection 30 from local rank 7, transport 2 localhost:3808851:3809776 [0] MCCL INFO Connection to proxy localRank 0 -> connection 0x7feda8001910 localhost:3808851:3809710 [0] MCCL INFO ip=192.168.100.13, port=0 localhost:3808851:3809776 [0] MCCL INFO Channel 31/0 : 15[5e000] -> 0[5000] [receive] via NET/IB/0 comm 0x7feed6140010 nRanks 16 localhost:3808851:3809776 [0] MCCL INFO Channel 00 : 0[5000] -> 1[b000] via SHM/direct/direct comm 0x7feed6140010 nRanks 16 localhost:3808851:3809776 [0] MCCL INFO Channel 01 : 0[5000] -> 1[b000] via SHM/direct/direct comm 0x7feed6140010 nRanks 16 localhost:3808851:3809776 [0] MCCL INFO Channel 02 : 0[5000] -> 1[b000] via SHM/direct/direct comm 0x7feed6140010 nRanks 16 
localhost:3808851:3809776 [0] MCCL INFO Channel 03 : 0[5000] -> 1[b000] via SHM/direct/direct comm 0x7feed6140010 nRanks 16 localhost:3808851:3809776 [0] MCCL INFO Channel 04 : 0[5000] -> 1[b000] via SHM/direct/direct comm 0x7feed6140010 nRanks 16 localhost:3808851:3809776 [0] MCCL INFO Channel 05 : 0[5000] -> 1[b000] via SHM/direct/direct comm 0x7feed6140010 nRanks 16 localhost:3808851:3809776 [0] MCCL INFO Channel 06 : 0[5000] -> 1[b000] via SHM/direct/direct comm 0x7feed6140010 nRanks 16 localhost:3808851:3809776 [0] MCCL INFO Channel 07 : 0[5000] -> 1[b000] via SHM/direct/direct comm 0x7feed6140010 nRanks 16 localhost:3808851:3809776 [0] MCCL INFO Channel 08 : 0[5000] -> 1[b000] via SHM/direct/direct comm 0x7feed6140010 nRanks 16 localhost:3808851:3809776 [0] MCCL INFO Channel 09 : 0[5000] -> 1[b000] via SHM/direct/direct comm 0x7feed6140010 nRanks 16 localhost:3808851:3809776 [0] MCCL INFO Channel 10 : 0[5000] -> 1[b000] via SHM/direct/direct comm 0x7feed6140010 nRanks 16 localhost:3808851:3809776 [0] MCCL INFO Channel 11 : 0[5000] -> 1[b000] via SHM/direct/direct comm 0x7feed6140010 nRanks 16 localhost:3808851:3809776 [0] MCCL INFO Channel 12 : 0[5000] -> 1[b000] via SHM/direct/direct comm 0x7feed6140010 nRanks 16 localhost:3808851:3809776 [0] MCCL INFO Channel 13 : 0[5000] -> 1[b000] via SHM/direct/direct comm 0x7feed6140010 nRanks 16 localhost:3808851:3809776 [0] MCCL INFO Channel 14 : 0[5000] -> 1[b000] via SHM/direct/direct comm 0x7feed6140010 nRanks 16 localhost:3808851:3809776 [0] MCCL INFO Channel 15 : 0[5000] -> 1[b000] via SHM/direct/direct comm 0x7feed6140010 nRanks 16 localhost:3808851:3809776 [0] MCCL INFO Channel 16 : 0[5000] -> 1[b000] via SHM/direct/direct comm 0x7feed6140010 nRanks 16 localhost:3808851:3809776 [0] MCCL INFO Channel 17 : 0[5000] -> 1[b000] via SHM/direct/direct comm 0x7feed6140010 nRanks 16 localhost:3808851:3809776 [0] MCCL INFO Channel 18 : 0[5000] -> 1[b000] via SHM/direct/direct comm 0x7feed6140010 nRanks 16 
localhost:3808851:3809776 [0] MCCL INFO Channel 19 : 0[5000] -> 1[b000] via SHM/direct/direct comm 0x7feed6140010 nRanks 16 localhost:3808851:3809776 [0] MCCL INFO Channel 20 : 0[5000] -> 1[b000] via SHM/direct/direct comm 0x7feed6140010 nRanks 16 localhost:3808851:3809776 [0] MCCL INFO Channel 21 : 0[5000] -> 1[b000] via SHM/direct/direct comm 0x7feed6140010 nRanks 16 localhost:3808851:3809776 [0] MCCL INFO Channel 22 : 0[5000] -> 1[b000] via SHM/direct/direct comm 0x7feed6140010 nRanks 16 localhost:3808851:3809776 [0] MCCL INFO Channel 23 : 0[5000] -> 1[b000] via SHM/direct/direct comm 0x7feed6140010 nRanks 16 localhost:3808851:3809776 [0] MCCL INFO Channel 24 : 0[5000] -> 1[b000] via SHM/direct/direct comm 0x7feed6140010 nRanks 16 localhost:3808851:3809776 [0] MCCL INFO Channel 25 : 0[5000] -> 1[b000] via SHM/direct/direct comm 0x7feed6140010 nRanks 16 localhost:3808851:3809776 [0] MCCL INFO Channel 26 : 0[5000] -> 1[b000] via SHM/direct/direct comm 0x7feed6140010 nRanks 16 localhost:3808851:3809776 [0] MCCL INFO Channel 27 : 0[5000] -> 1[b000] via SHM/direct/direct comm 0x7feed6140010 nRanks 16 localhost:3808851:3809776 [0] MCCL INFO Channel 28 : 0[5000] -> 1[b000] via SHM/direct/direct comm 0x7feed6140010 nRanks 16 localhost:3808851:3809776 [0] MCCL INFO Channel 29 : 0[5000] -> 1[b000] via SHM/direct/direct comm 0x7feed6140010 nRanks 16 localhost:3808851:3809776 [0] MCCL INFO Channel 30 : 0[5000] -> 1[b000] via SHM/direct/direct comm 0x7feed6140010 nRanks 16 localhost:3808851:3809776 [0] MCCL INFO Channel 31 : 0[5000] -> 1[b000] via SHM/direct/direct comm 0x7feed6140010 nRanks 16 localhost:3808864:3809771 [7] MCCL INFO Connection to proxy localRank 7 -> connection 0x7f13780017f0 localhost:3808864:3809771 [7] MCCL INFO Channel 27/0 : 7[5e000] -> 8[5000] [send] via NET/IB/0 comm 0x7f14a6340010 nRanks 16 localhost:3808864:3809707 [7] MCCL INFO New proxy send connection 29 from local rank 7, transport 2 localhost:1542590:1542898 [0] MCCL INFO Connection to proxy 
localRank 0 -> connection 0x7fcda8001910 localhost:1542590:1542823 [0] MCCL INFO ip=192.168.100.14, port=0 localhost:1542590:1542898 [0] MCCL INFO Channel 31/0 : 7[5e000] -> 8[5000] [receive] via NET/IB/0 comm 0x7fced0b40010 nRanks 16 localhost:1542590:1542898 [0] MCCL INFO Channel 00 : 8[5000] -> 9[b000] via SHM/direct/direct comm 0x7fced0b40010 nRanks 16 localhost:1542590:1542898 [0] MCCL INFO Channel 01 : 8[5000] -> 9[b000] via SHM/direct/direct comm 0x7fced0b40010 nRanks 16 localhost:1542590:1542898 [0] MCCL INFO Channel 02 : 8[5000] -> 9[b000] via SHM/direct/direct comm 0x7fced0b40010 nRanks 16 localhost:1542590:1542898 [0] MCCL INFO Channel 03 : 8[5000] -> 9[b000] via SHM/direct/direct comm 0x7fced0b40010 nRanks 16 localhost:1542590:1542898 [0] MCCL INFO Channel 04 : 8[5000] -> 9[b000] via SHM/direct/direct comm 0x7fced0b40010 nRanks 16 localhost:1542590:1542898 [0] MCCL INFO Channel 05 : 8[5000] -> 9[b000] via SHM/direct/direct comm 0x7fced0b40010 nRanks 16 localhost:1542590:1542898 [0] MCCL INFO Channel 06 : 8[5000] -> 9[b000] via SHM/direct/direct comm 0x7fced0b40010 nRanks 16 localhost:1542590:1542898 [0] MCCL INFO Channel 07 : 8[5000] -> 9[b000] via SHM/direct/direct comm 0x7fced0b40010 nRanks 16 localhost:1542590:1542898 [0] MCCL INFO Channel 08 : 8[5000] -> 9[b000] via SHM/direct/direct comm 0x7fced0b40010 nRanks 16 localhost:1542590:1542898 [0] MCCL INFO Channel 09 : 8[5000] -> 9[b000] via SHM/direct/direct comm 0x7fced0b40010 nRanks 16 localhost:1542590:1542898 [0] MCCL INFO Channel 10 : 8[5000] -> 9[b000] via SHM/direct/direct comm 0x7fced0b40010 nRanks 16 localhost:1542590:1542898 [0] MCCL INFO Channel 11 : 8[5000] -> 9[b000] via SHM/direct/direct comm 0x7fced0b40010 nRanks 16 localhost:1542590:1542898 [0] MCCL INFO Channel 12 : 8[5000] -> 9[b000] via SHM/direct/direct comm 0x7fced0b40010 nRanks 16 localhost:1542590:1542898 [0] MCCL INFO Channel 13 : 8[5000] -> 9[b000] via SHM/direct/direct comm 0x7fced0b40010 nRanks 16 localhost:1542590:1542898 
[0] MCCL INFO Channel 14 : 8[5000] -> 9[b000] via SHM/direct/direct comm 0x7fced0b40010 nRanks 16 localhost:1542590:1542898 [0] MCCL INFO Channel 15 : 8[5000] -> 9[b000] via SHM/direct/direct comm 0x7fced0b40010 nRanks 16 localhost:1542590:1542898 [0] MCCL INFO Channel 16 : 8[5000] -> 9[b000] via SHM/direct/direct comm 0x7fced0b40010 nRanks 16 localhost:1542590:1542898 [0] MCCL INFO Channel 17 : 8[5000] -> 9[b000] via SHM/direct/direct comm 0x7fced0b40010 nRanks 16 localhost:1542590:1542898 [0] MCCL INFO Channel 18 : 8[5000] -> 9[b000] via SHM/direct/direct comm 0x7fced0b40010 nRanks 16 localhost:1542590:1542898 [0] MCCL INFO Channel 19 : 8[5000] -> 9[b000] via SHM/direct/direct comm 0x7fced0b40010 nRanks 16 localhost:1542590:1542898 [0] MCCL INFO Channel 20 : 8[5000] -> 9[b000] via SHM/direct/direct comm 0x7fced0b40010 nRanks 16 localhost:1542590:1542898 [0] MCCL INFO Channel 21 : 8[5000] -> 9[b000] via SHM/direct/direct comm 0x7fced0b40010 nRanks 16 localhost:1542590:1542898 [0] MCCL INFO Channel 22 : 8[5000] -> 9[b000] via SHM/direct/direct comm 0x7fced0b40010 nRanks 16 localhost:1542590:1542898 [0] MCCL INFO Channel 23 : 8[5000] -> 9[b000] via SHM/direct/direct comm 0x7fced0b40010 nRanks 16 localhost:1542590:1542898 [0] MCCL INFO Channel 24 : 8[5000] -> 9[b000] via SHM/direct/direct comm 0x7fced0b40010 nRanks 16 localhost:1542590:1542898 [0] MCCL INFO Channel 25 : 8[5000] -> 9[b000] via SHM/direct/direct comm 0x7fced0b40010 nRanks 16 localhost:1542590:1542898 [0] MCCL INFO Channel 26 : 8[5000] -> 9[b000] via SHM/direct/direct comm 0x7fced0b40010 nRanks 16 localhost:1542590:1542898 [0] MCCL INFO Channel 27 : 8[5000] -> 9[b000] via SHM/direct/direct comm 0x7fced0b40010 nRanks 16 localhost:1542590:1542898 [0] MCCL INFO Channel 28 : 8[5000] -> 9[b000] via SHM/direct/direct comm 0x7fced0b40010 nRanks 16 localhost:1542590:1542898 [0] MCCL INFO Channel 29 : 8[5000] -> 9[b000] via SHM/direct/direct comm 0x7fced0b40010 nRanks 16 localhost:1542590:1542898 [0] MCCL INFO 
Channel 30 : 8[5000] -> 9[b000] via SHM/direct/direct comm 0x7fced0b40010 nRanks 16 localhost:1542590:1542898 [0] MCCL INFO Channel 31 : 8[5000] -> 9[b000] via SHM/direct/direct comm 0x7fced0b40010 nRanks 16 localhost:1542597:1542901 [7] MCCL INFO Connection to proxy localRank 7 -> connection 0x7f74a0001880 localhost:1542597:1542901 [7] MCCL INFO Channel 29/0 : 15[5e000] -> 0[5000] [send] via NET/IB/0 comm 0x7f75cdd40010 nRanks 16 localhost:1542597:1542822 [7] MCCL INFO New proxy send connection 31 from local rank 7, transport 2 localhost:3808864:3809771 [7] MCCL INFO Connection to proxy localRank 7 -> connection 0x7f1378001838 localhost:3808864:3809771 [7] MCCL INFO Channel 28/0 : 7[5e000] -> 8[5000] [send] via NET/IB/0 comm 0x7f14a6340010 nRanks 16 localhost:3808864:3809707 [7] MCCL INFO New proxy send connection 30 from local rank 7, transport 2 localhost:1542597:1542901 [7] MCCL INFO Connection to proxy localRank 7 -> connection 0x7f74a00018c8 localhost:1542597:1542901 [7] MCCL INFO Channel 30/0 : 15[5e000] -> 0[5000] [send] via NET/IB/0 comm 0x7f75cdd40010 nRanks 16 localhost:1542597:1542822 [7] MCCL INFO New proxy send connection 32 from local rank 7, transport 2 localhost:3808864:3809771 [7] MCCL INFO Connection to proxy localRank 7 -> connection 0x7f1378001880 localhost:3808864:3809771 [7] MCCL INFO Channel 29/0 : 7[5e000] -> 8[5000] [send] via NET/IB/0 comm 0x7f14a6340010 nRanks 16 localhost:3808864:3809707 [7] MCCL INFO New proxy send connection 31 from local rank 7, transport 2 localhost:1542597:1542901 [7] MCCL INFO Connection to proxy localRank 7 -> connection 0x7f74a0001910 localhost:1542597:1542901 [7] MCCL INFO Channel 31/0 : 15[5e000] -> 0[5000] [send] via NET/IB/0 comm 0x7f75cdd40010 nRanks 16 localhost:1542597:1542822 [7] MCCL INFO NET/IB: Dev 0 Port 1 qpn 344 mtu 5 GID 3 (0/E64A8C0FFFF0000) localhost:1542597:1542822 [7] /workspace/out/Release/build/linux/x86_64/mccl/macaify/src/misc/ibvwrap.cc:309 MCCL WARN Call to ibv_reg_mr failed with error 
Invalid argument localhost:1542597:1542822 [7] MCCL INFO /workspace/out/Release/build/linux/x86_64/mccl/macaify/src/transport/net_ib.cc:1328 -> 2 localhost:1542597:1542822 [7] MCCL INFO /workspace/communication/mccl/src/include/net.h:53 -> 2 localhost:1542597:1542822 [7] MCCL INFO /workspace/out/Release/build/linux/x86_64/mccl/macaify/src/transport/net.cc:789 -> 2 localhost:1542597:1542822 [7] MCCL INFO /workspace/out/Release/build/linux/x86_64/mccl/macaify/src/proxy.cc:1120 -> 2 localhost:1542597:1542822 [7] MCCL INFO /workspace/out/Release/build/linux/x86_64/mccl/macaify/src/proxy.cc:1169 -> 2 localhost:1542597:1542822 [7] /workspace/out/Release/build/linux/x86_64/mccl/macaify/src/proxy.cc:1312 MCCL WARN [Proxy Service 15] Failed to execute operation Connect from rank 15, retcode 2 localhost:3808864:3809771 [7] MCCL INFO Connection to proxy localRank 7 -> connection 0x7f13780018c8 localhost:3808864:3809771 [7] MCCL INFO Channel 30/0 : 7[5e000] -> 8[5000] [send] via NET/IB/0 comm 0x7f14a6340010 nRanks 16 localhost:3808864:3809707 [7] MCCL INFO New proxy send connection 32 from local rank 7, transport 2 localhost:3808864:3809771 [7] MCCL INFO Connection to proxy localRank 7 -> connection 0x7f1378001910 localhost:3808864:3809771 [7] MCCL INFO Channel 31/0 : 7[5e000] -> 8[5000] [send] via NET/IB/0 comm 0x7f14a6340010 nRanks 16 localhost:3808864:3809707 [7] MCCL INFO NET/IB: Dev 0 Port 1 qpn 345 mtu 5 GID 3 (0/D64A8C0FFFF0000) localhost:3808864:3809707 [7] /workspace/out/Release/build/linux/x86_64/mccl/macaify/src/misc/ibvwrap.cc:309 MCCL WARN Call to ibv_reg_mr failed with error Invalid argument localhost:3808864:3809707 [7] MCCL INFO /workspace/out/Release/build/linux/x86_64/mccl/macaify/src/transport/net_ib.cc:1328 -> 2 localhost:3808864:3809707 [7] MCCL INFO /workspace/communication/mccl/src/include/net.h:53 -> 2 localhost:3808864:3809707 [7] MCCL INFO /workspace/out/Release/build/linux/x86_64/mccl/macaify/src/transport/net.cc:789 -> 2 localhost:3808864:3809707 
[7] MCCL INFO /workspace/out/Release/build/linux/x86_64/mccl/macaify/src/proxy.cc:1120 -> 2 localhost:3808864:3809707 [7] MCCL INFO /workspace/out/Release/build/linux/x86_64/mccl/macaify/src/proxy.cc:1169 -> 2 localhost:3808864:3809707 [7] /workspace/out/Release/build/linux/x86_64/mccl/macaify/src/proxy.cc:1312 MCCL WARN [Proxy Service 7] Failed to execute operation Connect from rank 7, retcode 2 localhost:3808852:3809773 [1] MCCL INFO Connected all rings comm 0x7fd9b9c24010 nRanks 16 busId b000 localhost:3808851:3809710 [0] /workspace/out/Release/build/linux/x86_64/mccl/macaify/src/misc/ibvwrap.cc:309 MCCL WARN Call to ibv_reg_mr failed with error Invalid argument localhost:3808851:3809710 [0] MCCL INFO /workspace/out/Release/build/linux/x86_64/mccl/macaify/src/transport/net_ib.cc:1328 -> 2 localhost:3808851:3809710 [0] MCCL INFO /workspace/communication/mccl/src/include/net.h:53 -> 2 localhost:3808851:3809710 [0] MCCL INFO /workspace/out/Release/build/linux/x86_64/mccl/macaify/src/transport/net.cc:940 -> 2 localhost:3808851:3809710 [0] MCCL INFO /workspace/out/Release/build/linux/x86_64/mccl/macaify/src/proxy.cc:1120 -> 2 localhost:3808851:3809710 [0] MCCL INFO /workspace/out/Release/build/linux/x86_64/mccl/macaify/src/proxy.cc:1169 -> 2 localhost:3808851:3809710 [0] /workspace/out/Release/build/linux/x86_64/mccl/macaify/src/proxy.cc:1312 MCCL WARN [Proxy Service 0] Failed to execute operation Connect from rank 0, retcode 2 localhost:1542596:1542902 [6] MCCL INFO Connected all rings comm 0x7efc82c24010 nRanks 16 busId 5b000 localhost:1542595:1542900 [5] MCCL INFO Connected all rings comm 0x7f93ef76a010 nRanks 16 busId 56000 localhost:1542593:1542895 [3] MCCL INFO Connected all rings comm 0x7f4964024010 nRanks 16 busId f000 localhost:1542591:1542899 [1] MCCL INFO Connected all rings comm 0x7f1e96c00010 nRanks 16 busId b000 localhost:1542592:1542897 [2] MCCL INFO Connected all rings comm 0x7fe15ae00010 nRanks 16 busId e000 localhost:1542594:1542896 [4] MCCL INFO 
Connected all rings comm 0x7fef53940010 nRanks 16 busId 55000 localhost:1542590:1542823 [0] /workspace/out/Release/build/linux/x86_64/mccl/macaify/src/misc/ibvwrap.cc:309 MCCL WARN Call to ibv_reg_mr failed with error Invalid argument localhost:1542590:1542823 [0] MCCL INFO /workspace/out/Release/build/linux/x86_64/mccl/macaify/src/transport/net_ib.cc:1328 -> 2 localhost:1542590:1542823 [0] MCCL INFO /workspace/communication/mccl/src/include/net.h:53 -> 2 localhost:1542590:1542823 [0] MCCL INFO /workspace/out/Release/build/linux/x86_64/mccl/macaify/src/transport/net.cc:940 -> 2 localhost:1542590:1542823 [0] MCCL INFO /workspace/out/Release/build/linux/x86_64/mccl/macaify/src/proxy.cc:1120 -> 2 localhost:1542590:1542823 [0] MCCL INFO /workspace/out/Release/build/linux/x86_64/mccl/macaify/src/proxy.cc:1169 -> 2 localhost:1542590:1542823 [0] /workspace/out/Release/build/linux/x86_64/mccl/macaify/src/proxy.cc:1312 MCCL WARN [Proxy Service 8] Failed to execute operation Connect from rank 8, retcode 2 localhost:3808853:3809774 [2] MCCL INFO Connected all rings comm 0x7fc704e00010 nRanks 16 busId e000 localhost:3808855:3809772 [4] MCCL INFO Connected all rings comm 0x7fb633d40010 nRanks 16 busId 55000 localhost:3808858:3809775 [6] MCCL INFO Connected all rings comm 0x7fcf8a717010 nRanks 16 busId 5b000 localhost:3808854:3809769 [3] MCCL INFO Connected all rings comm 0x7f10c1b6a010 nRanks 16 busId f000 localhost:3808857:3809770 [5] MCCL INFO Connected all rings comm 0x7f3c2ee24010 nRanks 16 busId 56000 localhost:1542597:1542901 [7] /workspace/out/Release/build/linux/x86_64/mccl/macaify/src/misc/socket.cc:65 MCCL WARN socketProgress: Connection closed by remote peer localhost.localdomain<42591> localhost:1542597:1542901 [7] MCCL INFO /workspace/out/Release/build/linux/x86_64/mccl/macaify/src/misc/socket.cc:75 -> 6 localhost:1542597:1542901 [7] MCCL INFO /workspace/out/Release/build/linux/x86_64/mccl/macaify/src/misc/socket.cc:942 -> 6 localhost:1542597:1542901 [7] MCCL INFO 
/workspace/out/Release/build/linux/x86_64/mccl/macaify/src/proxy.cc:996 -> 6 localhost:1542597:1542901 [7] /workspace/out/Release/build/linux/x86_64/mccl/macaify/src/proxy.cc:1000 MCCL WARN Proxy Call to rank 15 failed (Connect) localhost:1542597:1542901 [7] MCCL INFO /workspace/out/Release/build/linux/x86_64/mccl/macaify/src/transport/net.cc:369 -> 6 localhost:1542597:1542901 [7] MCCL INFO /workspace/out/Release/build/linux/x86_64/mccl/macaify/src/transport.cc:180 -> 6 localhost:1542597:1542901 [7] MCCL INFO /workspace/out/Release/build/linux/x86_64/mccl/macaify/src/transport.cc:350 -> 6 localhost:1542597:1542901 [7] MCCL INFO /workspace/out/Release/build/linux/x86_64/mccl/macaify/src/group.cc:178 -> 6 localhost:1542597:1542901 [7] MCCL INFO /workspace/out/Release/build/linux/x86_64/mccl/macaify/src/group.cc:72 -> 6 [Async thread] localhost:1542597:1542597 [7] MCCL INFO /workspace/out/Release/build/linux/x86_64/mccl/macaify/src/group.cc:565 -> 3 localhost:1542597:1542597 [7] MCCL INFO /workspace/out/Release/build/linux/x86_64/mccl/macaify/src/group.cc:673 -> 3 localhost:1542597:1542597 [7] MCCL INFO /workspace/out/Release/build/linux/x86_64/mccl/macaify/src/enqueue.cc:3648 -> 3 localhost: Test NCCL failure all_reduce.cu:56 'internal error / Proxy Call to rank 15 failed (Connect)'  .. localhost pid 1542597: Test failure common.cu:488  .. localhost pid 1542597: Test failure common.cu:830  .. localhost pid 1542597: Test failure all_reduce.cu:102  .. localhost pid 1542597: Test failure common.cu:926  .. localhost pid 1542597: Test failure common.cu:1515  .. 
localhost pid 1542597: Test failure common.cu:1271 localhost:1542597:1542597 [7] MCCL INFO comm 0x7f75cdd40010 rank 15 nranks 16 macaDev 7 busId 5e000 - Destroy COMPLETE localhost:1542597:1542597 [7] MCCL INFO MCCL destroyed the communicators localhost:3808864:3809771 [7] /workspace/out/Release/build/linux/x86_64/mccl/macaify/src/misc/socket.cc:65 MCCL WARN socketProgress: Connection closed by remote peer localhost.localdomain<35789> localhost:3808864:3809771 [7] MCCL INFO /workspace/out/Release/build/linux/x86_64/mccl/macaify/src/misc/socket.cc:75 -> 6 localhost:3808864:3809771 [7] MCCL INFO /workspace/out/Release/build/linux/x86_64/mccl/macaify/src/misc/socket.cc:942 -> 6 localhost:3808864:3809771 [7] MCCL INFO /workspace/out/Release/build/linux/x86_64/mccl/macaify/src/proxy.cc:996 -> 6 localhost:3808864:3809771 [7] /workspace/out/Release/build/linux/x86_64/mccl/macaify/src/proxy.cc:1000 MCCL WARN Proxy Call to rank 7 failed (Connect) localhost:3808864:3809771 [7] MCCL INFO /workspace/out/Release/build/linux/x86_64/mccl/macaify/src/transport/net.cc:369 -> 6 localhost:3808864:3809771 [7] MCCL INFO /workspace/out/Release/build/linux/x86_64/mccl/macaify/src/transport.cc:180 -> 6 localhost:3808864:3809771 [7] MCCL INFO /workspace/out/Release/build/linux/x86_64/mccl/macaify/src/transport.cc:350 -> 6 localhost:3808864:3809771 [7] MCCL INFO /workspace/out/Release/build/linux/x86_64/mccl/macaify/src/group.cc:178 -> 6 localhost:3808864:3809771 [7] MCCL INFO /workspace/out/Release/build/linux/x86_64/mccl/macaify/src/group.cc:72 -> 6 [Async thread] localhost:3808864:3808864 [7] MCCL INFO /workspace/out/Release/build/linux/x86_64/mccl/macaify/src/group.cc:565 -> 3 localhost:3808864:3808864 [7] MCCL INFO /workspace/out/Release/build/linux/x86_64/mccl/macaify/src/group.cc:673 -> 3 localhost:3808864:3808864 [7] MCCL INFO /workspace/out/Release/build/linux/x86_64/mccl/macaify/src/enqueue.cc:3648 -> 3 localhost: Test NCCL failure all_reduce.cu:56 'internal error / Proxy Call to 
rank 7 failed (Connect)'  .. localhost pid 3808864: Test failure common.cu:488  .. localhost pid 3808864: Test failure common.cu:830  .. localhost pid 3808864: Test failure all_reduce.cu:102  .. localhost pid 3808864: Test failure common.cu:926  .. localhost pid 3808864: Test failure common.cu:1515  .. localhost pid 3808864: Test failure common.cu:1271 localhost:3808864:3808864 [7] MCCL INFO comm 0x7f14a6340010 rank 7 nranks 16 macaDev 7 busId 5e000 - Destroy COMPLETE localhost:3808864:3808864 [7] MCCL INFO MCCL destroyed the communicators localhost:3808851:3809776 [0] /workspace/out/Release/build/linux/x86_64/mccl/macaify/src/misc/socket.cc:65 MCCL WARN socketProgress: Connection closed by remote peer localhost.localdomain<50873> localhost:3808851:3809776 [0] MCCL INFO /workspace/out/Release/build/linux/x86_64/mccl/macaify/src/misc/socket.cc:75 -> 6 localhost:3808851:3809776 [0] MCCL INFO /workspace/out/Release/build/linux/x86_64/mccl/macaify/src/misc/socket.cc:942 -> 6 localhost:3808851:3809776 [0] MCCL INFO /workspace/out/Release/build/linux/x86_64/mccl/macaify/src/proxy.cc:996 -> 6 localhost:3808851:3809776 [0] /workspace/out/Release/build/linux/x86_64/mccl/macaify/src/proxy.cc:1000 MCCL WARN Proxy Call to rank 0 failed (Connect) localhost:3808851:3809776 [0] MCCL INFO /workspace/out/Release/build/linux/x86_64/mccl/macaify/src/transport/net.cc:427 -> 6 localhost:3808851:3809776 [0] MCCL INFO /workspace/out/Release/build/linux/x86_64/mccl/macaify/src/transport.cc:196 -> 6 localhost:3808851:3809776 [0] MCCL INFO /workspace/out/Release/build/linux/x86_64/mccl/macaify/src/transport.cc:350 -> 6 localhost:3808851:3809776 [0] MCCL INFO /workspace/out/Release/build/linux/x86_64/mccl/macaify/src/group.cc:178 -> 6 localhost:3808851:3809776 [0] MCCL INFO /workspace/out/Release/build/linux/x86_64/mccl/macaify/src/group.cc:72 -> 6 [Async thread] localhost:3808851:3808851 [0] MCCL INFO /workspace/out/Release/build/linux/x86_64/mccl/macaify/src/group.cc:565 -> 3 
localhost:3808851:3808851 [0] MCCL INFO /workspace/out/Release/build/linux/x86_64/mccl/macaify/src/group.cc:673 -> 3 localhost:3808851:3808851 [0] MCCL INFO /workspace/out/Release/build/linux/x86_64/mccl/macaify/src/enqueue.cc:3648 -> 3 localhost: Test NCCL failure all_reduce.cu:56 'internal error / Proxy Call to rank 0 failed (Connect)'  .. localhost pid 3808851: Test failure common.cu:488  .. localhost pid 3808851: Test failure common.cu:830  .. localhost pid 3808851: Test failure all_reduce.cu:102  .. localhost pid 3808851: Test failure common.cu:926  .. localhost pid 3808851: Test failure common.cu:1515  .. localhost pid 3808851: Test failure common.cu:1271 -------------------------------------------------------------------------- Primary job  terminated normally, but 1 process returned a non-zero exit code. Per user-direction, the job has been aborted. -------------------------------------------------------------------------- localhost:3808851:3808851 [0] MCCL INFO comm 0x7feed6140010 rank 0 nranks 16 macaDev 0 busId 5000 - Destroy COMPLETE localhost:3808851:3808851 [0] MCCL INFO MCCL destroyed the communicators localhost:1542590:1542898 [0] /workspace/out/Release/build/linux/x86_64/mccl/macaify/src/misc/socket.cc:65 MCCL WARN socketProgress: Connection closed by remote peer localhost.localdomain<58863> localhost:1542590:1542898 [0] MCCL INFO /workspace/out/Release/build/linux/x86_64/mccl/macaify/src/misc/socket.cc:75 -> 6 localhost:1542590:1542898 [0] MCCL INFO /workspace/out/Release/build/linux/x86_64/mccl/macaify/src/misc/socket.cc:942 -> 6 localhost:1542590:1542898 [0] MCCL INFO /workspace/out/Release/build/linux/x86_64/mccl/macaify/src/proxy.cc:996 -> 6 localhost:1542590:1542898 [0] /workspace/out/Release/build/linux/x86_64/mccl/macaify/src/proxy.cc:1000 MCCL WARN Proxy Call to rank 8 failed (Connect) localhost:1542590:1542898 [0] MCCL INFO /workspace/out/Release/build/linux/x86_64/mccl/macaify/src/transport/net.cc:427 -> 6 localhost:1542590:1542898 [0] 
MCCL INFO /workspace/out/Release/build/linux/x86_64/mccl/macaify/src/transport.cc:196 -> 6 localhost:1542590:1542898 [0] MCCL INFO /workspace/out/Release/build/linux/x86_64/mccl/macaify/src/transport.cc:350 -> 6 localhost:1542590:1542898 [0] MCCL INFO /workspace/out/Release/build/linux/x86_64/mccl/macaify/src/group.cc:178 -> 6 localhost:1542590:1542898 [0] MCCL INFO /workspace/out/Release/build/linux/x86_64/mccl/macaify/src/group.cc:72 -> 6 [Async thread] localhost:1542590:1542590 [0] MCCL INFO /workspace/out/Release/build/linux/x86_64/mccl/macaify/src/group.cc:565 -> 3 localhost:1542590:1542590 [0] MCCL INFO /workspace/out/Release/build/linux/x86_64/mccl/macaify/src/group.cc:673 -> 3 localhost:1542590:1542590 [0] MCCL INFO /workspace/out/Release/build/linux/x86_64/mccl/macaify/src/enqueue.cc:3648 -> 3 localhost: Test NCCL failure all_reduce.cu:56 'internal error / Proxy Call to rank 8 failed (Connect)'  .. localhost pid 1542590: Test failure common.cu:488  .. localhost pid 1542590: Test failure common.cu:830  .. localhost pid 1542590: Test failure all_reduce.cu:102  .. localhost pid 1542590: Test failure common.cu:926  .. localhost pid 1542590: Test failure common.cu:1515  .. localhost pid 1542590: Test failure common.cu:1271 localhost:1542590:1542590 [0] MCCL INFO comm 0x7fced0b40010 rank 8 nranks 16 macaDev 0 busId 5000 - Destroy COMPLETE localhost:1542590:1542590 [0] MCCL INFO MCCL destroyed the communicators -------------------------------------------------------------------------- mpirun detected that one or more processes exited with non-zero status, thus causing the job to be terminated. The first process to do so was:   Process name: [[52991,1],15]   Exit code:    3 --------------------------------------------------------------------------
This is the complete output log.