Problems using mcProfiler


jiangbin
Members 16 posts

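August 7, 2025, 15:52

Under the Task Menu, at Execute Loop 0, I can see the command that runs the model inside docker, and the model runs successfully. But I still get the same error. If a firewall were blocking it, this data should not be able to come back. I have confirmed that gui-profiler has been added to the whitelist.

1. In the server log file I can see the model inference logs and the printed results. mcRpcPort.ini: ['cat: /root/mcRpcPort.ini: No such file or directory'] Does this need to be fixed manually, or does an RPC service have to be started separately? If so, how do I start it?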

[warning] cannot get rpc server port, try more times:cannot retrieve rpc server port.
mcRpcPort.ini: ['cat: /root/mcRpcPort.ini: No such file or directory']
[warning] cannot get rpc server port, try more times:cannot retrieve rpc server port.
mcToolsExtPid_UsedPort:10466-46153
mcRpcPort.ini: ['10466,46153,python3.10']
[info] mctool client port:46153
INFO 08-07 15:32:15 [parallel_state.py:1004] rank 0 in world size 1 is assigned as DP rank 0, PP rank 0, TP rank 0
INFO 08-07 15:32:15 [cuda.py:204] Using FlashMLA backend on V1 engine.
WARNING 08-07 15:32:15 [topk_topp_sampler.py:69] FlashInfer is not available. Falling back to the PyTorch-native implementation of top-p & top-k sampling. For the best performance, please install FlashInfer.
INFO 08-07 15:32:16 [gpu_model_runner.py:1329] Starting to load model /home/weight/DeepSeek-V2-Lite...
Loading safetensors checkpoint shards: 0% Completed | 0/4 [00:00<?, ?it/s]
Loading safetensors checkpoint shards: 25% Completed | 1/4 [00:04<00:12, 4.27s/it]
Loading safetensors checkpoint shards: 50% Completed | 2/4 [00:07<00:07, 3.54s/it]
Loading safetensors checkpoint shards: 75% Completed | 3/4 [00:11<00:04, 4.02s/it]
[error] cannot connect to server:
[error] execute failed:
[error] Traceback (most recent call last):
File "prpc_client\mctool_client.py", line 36, in init
File "grpc_utilities.py", line 151, in result
File "grpc_utilities.py", line 97, in _block
grpc.FutureTimeoutError

Could you please take another look? Thanks.
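The log in this post first fails to read /root/mcRpcPort.ini, retries, and then prints an entry such as 10466,46153,python3.10. Comparing that with the lines mcToolsExtPid_UsedPort:10466-46153 and mctool client port:46153, the three fields appear to be a process ID, a port, and a process name. That layout is inferred from this log only, not from the mcProfiler documentation. Under that assumption, a minimal Python sketch for pulling the port out of such an entry could look like this (parse_rpc_port_entry and RpcPortEntry are hypothetical names, not part of any MetaX tool):

```python
from typing import NamedTuple


class RpcPortEntry(NamedTuple):
    pid: int       # assumed: PID of the profiled process
    port: int      # assumed: port the mctool client connects to
    process: str   # assumed: process name, e.g. "python3.10"


def parse_rpc_port_entry(line: str) -> RpcPortEntry:
    """Parse one 'pid,port,process' entry as seen in the log above."""
    parts = [p.strip() for p in line.strip().split(",")]
    if len(parts) != 3:
        # Covers the 'cat: ... No such file or directory' case, which has no commas.
        raise ValueError(f"unexpected mcRpcPort.ini entry: {line!r}")
    pid, port, process = parts
    return RpcPortEntry(int(pid), int(port), process)


entry = parse_rpc_port_entry("10466,46153,python3.10")
print(entry.port)
```

The error text from the first attempt in the log is rejected with a ValueError rather than being parsed, which mirrors the retry behavior the warnings suggest.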

jiangbin
Members 16 posts

August 7, 2025, 16:29

In the same log, this is the model execution output printed by the server:
[warning] cannot get rpc server port, try more times:cannot retrieve rpc server port.
mcToolsExtPid_UsedPort:11989-33853
mcRpcPort.ini: ['11989,33853,python3.10']
[info] mctool client port:33853
INFO 08-07 15:52:12 [parallel_state.py:1004] rank 0 in world size 1 is assigned as DP rank 0, PP rank 0, TP rank 0
INFO 08-07 15:52:12 [cuda.py:204] Using FlashMLA backend on V1 engine.
WARNING 08-07 15:52:12 [topk_topp_sampler.py:69] FlashInfer is not available. Falling back to the PyTorch-native implementation of top-p & top-k sampling. For the best performance, please install FlashInfer.
INFO 08-07 15:52:13 [gpu_model_runner.py:1329] Starting to load model /home/weight/DeepSeek-V2-Lite...
Loading safetensors checkpoint shards: 0% Completed | 0/4 [00:00<?, ?it/s]
Loading safetensors checkpoint shards: 25% Completed | 1/4 [00:04<00:13, 4.48s/it]
Loading safetensors checkpoint shards: 50% Completed | 2/4 [00:07<00:07, 3.74s/it]
Loading safetensors checkpoint shards: 75% Completed | 3/4 [00:12<00:04, 4.24s/it]
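The two logs above report a different client port on each run (46153, then 33853). The thread does not show whether that port is reachable from the Windows machine running gui-profiler. One way to check is a plain TCP connection attempt, sketched below in Python. port_reachable is a hypothetical helper, and the host address has to be supplied by whoever runs it. A successful TCP connection only shows that the port is open. It does not prove that the gRPC handshake would succeed.

```python
import socket


def port_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and connects; the context
        # manager closes the socket again right away.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

Calling port_reachable with the server address and the port from the current run's "mctool client port" line would return False if a firewall or an unpublished docker port is in the way, which would be consistent with the grpc.FutureTimeoutError in the traceback.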

shuai_chen
Members 221 posts

August 7, 2025, 17:52

Dear developer, please provide a screenshot of your Windows configuration.

jiangbin
Members 16 posts

August 7, 2025, 17:55

image.png
PNG, 233.8 KB, uploaded by jiangbin on August 7, 2025.

shuai_chen
Members 221 posts

August 7, 2025, 17:56

@shuai_chen has written:

Dear developer, please provide a screenshot of your Windows configuration.

Dear developer, please provide a complete screenshot of your Windows configuration.

jiangbin
Members 16 posts

August 7, 2025, 17:58

This is the gui-profiler configuration. What else do you need?

shuai_chen
Members 221 posts

August 7, 2025, 18:14

@jiangbin has written:

This is the gui-profiler configuration. What else do you need?

Dear developer, please refer to the two images below.

image.png
PNG, 279.9 KB, uploaded by shuai_chen on August 7, 2025.

image.png
PNG, 214.2 KB, uploaded by shuai_chen on August 7, 2025.

jiangbin
Members 16 posts

August 15, 2025, 17:56

Hello, is there an operator-level performance analysis tool, similar to NVIDIA's Nsight Compute?

shuai_chen
Members 221 posts

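August 15, 2025, 17:59

@jiangbin has written:

Hello, is there an operator-level performance analysis tool, similar to NVIDIA's Nsight Compute?

Dear developer,
mcTracer captures the activity events of an MXMACA application and displays them as a time sequence.
mcTracer has two parts: mcTracer, the sampling program that collects the data, and mcTracer-Viewer, which displays the activity sequence.
The data types it can capture include:
MXMACA Runtime API: API activity provided by the MXMACA software stack
MXMACA Kernel: kernel functions running on Xiyun (曦云) series GPUs
MCTX: custom labeled range data
Reference: developer.metax-tech.com/api/client/document/preview/548/C500_mcTracerManual_CN.html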

jiangbin
Members 16 posts

August 15, 2025, 18:00

I have used mcTracer. How do I look at an operator in detail? I can only see the operator's kernel time.

shuai_chen
Members 221 posts

August 15, 2025, 18:11

Dear developer, please base your analysis on the kernel times together with the other performance data from mcTracer.
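As a rough illustration of analysis based on kernel times, the Python sketch below groups per-launch kernel durations by kernel name and ranks them by total time. The thread does not describe how mcTracer data can be exported, so the input here is simply assumed to be (kernel name, duration in microseconds) pairs. The sample records and the summarize_kernels helper are made up for illustration.

```python
from collections import defaultdict


def summarize_kernels(records):
    """Group (kernel_name, duration_us) pairs by name.

    Returns (name, calls, total_us, mean_us) tuples, largest total first.
    """
    totals = defaultdict(lambda: [0, 0.0])  # name -> [call count, total time]
    for name, duration_us in records:
        totals[name][0] += 1
        totals[name][1] += duration_us
    rows = [(name, calls, total, total / calls)
            for name, (calls, total) in totals.items()]
    rows.sort(key=lambda row: row[2], reverse=True)
    return rows


# Made-up sample data, not taken from a real trace.
sample = [("gemm_kernel", 120.0), ("softmax_kernel", 30.0), ("gemm_kernel", 80.0)]
for name, calls, total, mean in summarize_kernels(sample):
    print(f"{name}: calls={calls} total={total:.1f}us mean={mean:.1f}us")
```

A ranking like this only shows which kernels dominate the run time. It says nothing about what happens inside a kernel.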

jiangbin
Members 16 posts

August 15, 2025, 18:14

Can this kernel time be broken down further into what happens inside the chip, for example data movement versus execution?

shuai_chen
Members 221 posts

August 15, 2025, 18:15

Dear developer, please refer to the mcTracer documentation.

jiangbin
Members 16 posts

August 21, 2025, 11:15

Hello, I have a question about gitee.com/metax-maca/mxmaca-performance-tuning-guide/blob/main/guide/ch6.Kernel%E6%80%A7%E8%83%BD%E5%88%86%E6%9E%90%E5%B7%A5%E5%85%B7.md. Where can I download the operator performance analysis tools described there? I have checked all the download sections of the forum and could not find them. Thanks.

image.png
PNG, 189.2 KB, uploaded by jiangbin on August 21, 2025.

shuai_chen
Members 221 posts

August 21, 2025, 11:25

Dear developer, please open an issue in that repository.

jiangbin
Members 16 posts

August 21, 2025, 11:52

I have opened an issue at gitee.com/metax-maca/mxmaca-performance-tuning-guide/issues. Please reply there, thanks.

Thread has been moved from 信息沟通 (Information Exchange) by shuai_chen on September 9, 2025, 16:19.

inkstone
Members 7 posts

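September 21, 2025, 11:53

"perf counter" is provided by the mcProfiling tool. Judging from the earlier posts, have you still not managed to get it working?
"Cycle Trace" is more complicated to use and currently has to be requested specifically through business channels.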

inkstone
Members 7 posts

September 21, 2025, 11:59

Command Line: the program command to run, with its arguments
Case Name: the name of this task

image.png
PNG, 139.1 KB, uploaded by inkstone on September 21, 2025.