Diag memory test description

This post was last edited by funcY on 2024-12-27 13:46

Individual Test Descriptions

                               
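Test 0 [Address test, walking 1 bit]
This test changes one bit at a time in memory to see if it goes to a different memory location.
The start address is repeatedly offset to obtain new addresses. A single 1 bit is moved across a 32-bit word to produce a sequence of patterns, which is written to the new addresses and then read back and checked.
It can detect NPSFs and CFs:
Neighborhood Pattern Sensitive Fault (NPSF): the content of a memory cell, or the ability to change that content, is affected by the content of another memory cell.
Coupling Fault (CF): a change in the value of one memory cell causes the value of another memory cell to change.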
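The walking-1 sequence described above can be sketched roughly as follows. This is only an illustration, assuming memory is simulated as a Python list of 32-bit words; `walking_one_patterns` and `walking_one_test` are hypothetical names, not functions of the diag tool.

```python
def walking_one_patterns(width=32):
    """Return [0x1, 0x2, 0x4, ...]: a single 1 bit walking across `width` bits."""
    return [1 << bit for bit in range(width)]


def walking_one_test(mem):
    """Write a walking-1 pattern to every word of `mem`, then read it back.

    `mem` is a simulated memory (any indexable object with a length).
    Returns a list of (index, expected, actual) tuples for mismatching words.
    """
    patterns = walking_one_patterns()
    # Write pass: successive addresses get successive walking-1 patterns.
    for i in range(len(mem)):
        mem[i] = patterns[i % len(patterns)]
    # Read pass: every word must still hold the pattern written to it.
    errors = []
    for i in range(len(mem)):
        expected = patterns[i % len(patterns)]
        if mem[i] != expected:
            errors.append((i, expected, mem[i]))
    return errors
```

On a healthy simulated buffer such as `[0] * 64`, `walking_one_test` returns an empty list.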

Test 1 [Address test, own address]
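Each address is written with its own address and then is checked for consistency.
The value of each address is written into the memory at that address and then checked. This can reveal addresses that cannot be accessed.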
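A minimal sketch of the own-address check, assuming a simulated memory of 4-byte words. `AliasedMemory` is a hypothetical fault model (one address line stuck at 0) added only to show what kind of error the test exposes; none of these names come from the diag tool.

```python
class AliasedMemory:
    """Simulated memory where one address line is stuck at 0 (hypothetical fault model).

    Two different indices then select the same physical cell.
    """

    def __init__(self, size, stuck_line):
        self.cells = [0] * size
        self.mask = ~(1 << stuck_line)

    def __len__(self):
        return len(self.cells)

    def __getitem__(self, i):
        return self.cells[i & self.mask]

    def __setitem__(self, i, value):
        self.cells[i & self.mask] = value


def own_address_test(mem, base=0, word_size=4):
    """Write each word's own address into it, then verify.

    Returns a list of (address, actual) tuples for words that read back wrong.
    """
    for i in range(len(mem)):
        mem[i] = base + i * word_size
    errors = []
    for i in range(len(mem)):
        address = base + i * word_size
        if mem[i] != address:
            errors.append((address, mem[i]))
    return errors
```

With a healthy list the result is empty. With `AliasedMemory(16, stuck_line=2)` the later write to an aliased index overwrites the earlier one, so the lower index of each aliased pair reads back the wrong address.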

Test 2 [Moving inversions, ones & zeros]
This test uses the moving inversions algorithm with patterns of all ones and zeros.
This test does not take long and should quickly find all "hard" errors and some more subtle errors.
A sequence of all 1s is written, read back, and compared; then a sequence of all 0s is written, read back, and compared.
The "inversion" shows in the patterns: P1 = all 1s, P2 = all 0s, so P2 = ~P1 (P = pattern).
It can detect SAFs:
Stuck-At Fault (SAF): the value in a memory cell is fixed at 0 (SA0, Stuck-At-0) or 1 (SA1, Stuck-At-1) and cannot change.
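The write/compare steps with P1 and P2 = ~P1 might look like the sketch below. It follows only the simplified description given here (fill, read back, repeat with the complement) and does not reproduce the address sweeps of the full moving inversions algorithm. `StuckBitMemory` is a hypothetical fault model used to illustrate SA0/SA1 detection.

```python
MASK32 = 0xFFFFFFFF


class StuckBitMemory:
    """Simulated memory with one bit of one word stuck at 0 or 1 (hypothetical)."""

    def __init__(self, size, index, bit, stuck_value):
        self.cells = [0] * size
        self.index = index
        self.bit = bit
        self.stuck_value = stuck_value

    def __len__(self):
        return len(self.cells)

    def __getitem__(self, i):
        return self.cells[i]

    def __setitem__(self, i, value):
        if i == self.index:
            if self.stuck_value:
                value |= 1 << self.bit      # bit can never become 0
            else:
                value &= ~(1 << self.bit)   # bit can never become 1
        self.cells[i] = value & MASK32


def fill_and_check(mem, pattern):
    """Fill `mem` with `pattern`, read back, return (index, expected, actual) errors."""
    for i in range(len(mem)):
        mem[i] = pattern
    return [(i, pattern, mem[i]) for i in range(len(mem)) if mem[i] != pattern]


def ones_zeros_test(mem):
    """Check with P1 = all ones, then with P2 = ~P1 = all zeros."""
    p1 = MASK32
    p2 = ~p1 & MASK32
    return fill_and_check(mem, p1) + fill_and_check(mem, p2)
```

An SA0 bit is caught by the all-ones pass and an SA1 bit by the all-zeros pass, which is why both patterns are needed.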

Test 3 [Moving inversions, 8 bit pattern]
This is the same as test 0 but uses an 8-bit wide pattern of "walking" ones and zeros.
This test will better detect subtle errors in "wide" memory chips.
A finer-grained walking 1 bit: a single 1 bit is moved across 8 bits to produce a sub-pattern, and four copies of that sub-pattern make up the 32-bit pattern.
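As a small illustration of how the 32-bit word is built from the 8-bit sub-pattern (function names are hypothetical):

```python
def replicate_byte(byte):
    """Copy one 8-bit sub-pattern into all four bytes of a 32-bit word."""
    return byte | (byte << 8) | (byte << 16) | (byte << 24)


def walking_8bit_patterns():
    """Walk a single 1 bit across 8 bits and replicate it four times."""
    return [replicate_byte(1 << bit) for bit in range(8)]


patterns = walking_8bit_patterns()
print([hex(p) for p in patterns])
# first entry is 0x1010101, last entry is 0x80808080
```

Each of these eight words, and its complement, can then be used as the data pattern in place of the all-ones and all-zeros patterns.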

Test 4 [Moving inversions, random pattern]
This test uses the same algorithm as test 3 but the data pattern is a random number.
This test is particularly effective in finding difficult to detect data sensitive errors.
A randomly generated sequence is written and compared on read-back; this is a data-sensitivity test case.

Test 5 [Block move, 64 moves]
This test moves blocks of memory. Memory is initialized with shifting patterns that are inverted every 8 bytes.
Then these blocks of memory are moved around. After the moves are completed the data patterns are checked.
A specific sequence such as 1010101001010101 (the second half inverted) is constructed, written to consecutive memory in order, and then read back and compared.
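The idea can be sketched as below, under strong simplifying assumptions: memory is a Python list of 32-bit words, the "shifting" pattern is a rotating single bit, and a "move" is a swap of the two halves of the buffer. The real move sequence of the tool is not described in this post, so `init_block` and `block_move_test` are purely illustrative.

```python
MASK32 = 0xFFFFFFFF


def rotl32(x):
    """Rotate a 32-bit value left by one bit."""
    return ((x << 1) | (x >> 31)) & MASK32


def init_block(n_words, seed=1):
    """Build a shifting pattern that is inverted every 8 bytes (two 32-bit words)."""
    out = []
    pattern = seed
    while len(out) < n_words:
        out += [pattern, pattern]                    # 8 bytes of the pattern
        out += [~pattern & MASK32, ~pattern & MASK32]  # next 8 bytes inverted
        pattern = rotl32(pattern)
    return out[:n_words]


def block_move_test(n_words=64, moves=64):
    """Move blocks around, then check the data. `n_words` and `moves` must be even.

    Assumption: each move swaps the two halves, so an even number of moves
    must restore the initial layout. Returns indices of mismatching words.
    """
    expected = init_block(n_words)
    mem = list(expected)
    half = n_words // 2
    for _ in range(moves):
        mem[:half], mem[half:] = mem[half:], mem[:half]
    return [i for i in range(n_words) if mem[i] != expected[i]]
```

Because the expected data is known in advance, any word corrupted during one of the moves shows up in the final comparison.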

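Test 6 [Moving inversions, 32 bit pattern]
This is a variation of the moving inversions algorithm that shifts the data pattern left one bit for each successive address.
The starting bit position is shifted left for each pass. To use all possible data patterns 32 passes are required.
This test is quite effective at detecting data sensitive errors but the execution time is long.
Within a region of memory, write and read-back comparisons are performed many times. The patterns written are an initial pattern and the patterns produced by repeatedly shifting it left (the low bit is filled with a specified sval of 0 or 1).
This checks whether repeatedly reading and writing the same location within a short time causes data errors.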
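A rough sketch of the per-address shifting, assuming a list-based simulated memory. The parameter name `sval` is taken from the description above; the restart-at-bit-0 behaviour after 32 shifts and the function names are assumptions made for illustration.

```python
MASK32 = 0xFFFFFFFF


def shifted_patterns(n_words, start_bit, sval=0):
    """Return the pattern for each successive address.

    The pattern starts as a single 1 at `start_bit`, is shifted left one bit
    per address with `sval` filling the low bit, and restarts from bit 0
    once the walking bit has passed bit 31 (assumed behaviour).
    """
    out = []
    position = start_bit
    pattern = (1 << start_bit) & MASK32
    for _ in range(n_words):
        out.append(pattern)
        position += 1
        if position >= 32:
            position = 0
            pattern = 1
        else:
            pattern = ((pattern << 1) | sval) & MASK32
    return out


def shifting_pattern_test(mem, sval=0):
    """Run 32 passes, one per starting bit position, and collect errors.

    Returns a list of (start_bit, index, expected, actual) tuples.
    """
    errors = []
    for start_bit in range(32):
        patterns = shifted_patterns(len(mem), start_bit, sval)
        for i, pattern in enumerate(patterns):
            mem[i] = pattern
        for i, pattern in enumerate(patterns):
            if mem[i] != pattern:
                errors.append((start_bit, i, pattern, mem[i]))
    return errors
```

For example, `shifted_patterns(4, start_bit=30)` gives `0x40000000, 0x80000000, 0x1, 0x2`, and with `sval=1` starting at bit 0 the sequence begins `0x1, 0x3, 0x7`.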

Test 7 [Random number sequence]
This test writes a series of random numbers into memory (1MB).
The initial pattern is checked and then complemented and checked again on the next pass.
However, unlike the moving inversions test, writing and checking can only be done in the forward direction.
This checks whether consecutive sequential writes work correctly.
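One way to picture this test is with a seeded pseudo-random generator: re-seeding reproduces the same sequence, which is also why the check only runs in the forward direction. The sketch below assumes a list-based simulated memory and uses Python's `random.Random`; the seed value and function name are arbitrary choices, not taken from the tool.

```python
import random

MASK32 = 0xFFFFFFFF


def random_sequence_test(mem, seed=12345):
    """Write a random sequence, check it, complement it, and check again.

    Returns a list of (pass_number, index, expected, actual) tuples.
    """
    errors = []

    # Write pass: fill memory with the pseudo-random sequence.
    rng = random.Random(seed)
    for i in range(len(mem)):
        mem[i] = rng.getrandbits(32)

    # First check: regenerate the sequence, verify, then store the complement.
    rng = random.Random(seed)
    for i in range(len(mem)):
        expected = rng.getrandbits(32)
        if mem[i] != expected:
            errors.append((1, i, expected, mem[i]))
        mem[i] = ~expected & MASK32

    # Second check: memory must now hold the complemented sequence.
    rng = random.Random(seed)
    for i in range(len(mem)):
        expected = ~rng.getrandbits(32) & MASK32
        if mem[i] != expected:
            errors.append((2, i, expected, mem[i]))
    return errors
```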

Test 8 [Modulo 20, random pattern]
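Using the Modulo-X algorithm should uncover errors that are not detected by moving inversions due to cache and buffering interference with the algorithm.
A random pattern P1 is generated, with P2 = ~P1. P2 is written to all of memory, and then P1 is written at positions X, 2X, ..., nX.
This checks whether caching affects data correctness.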
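The fill order described above can be sketched like this, assuming a list-based simulated memory. The `offset` parameter (which position within each group of X receives P1) is an assumption added for illustration, as is the function name.

```python
MASK32 = 0xFFFFFFFF


def modulo_x_test(mem, p1, x=20, offset=0):
    """Write P2 = ~P1 everywhere, then P1 at every x-th location, and verify.

    Returns a list of (index, expected, actual) tuples for mismatching words.
    """
    p2 = ~p1 & MASK32
    # Step 1: fill all of memory with the complement pattern.
    for i in range(len(mem)):
        mem[i] = p2
    # Step 2: overwrite every x-th location (starting at `offset`) with P1.
    for i in range(offset, len(mem), x):
        mem[i] = p1
    # Step 3: verify both the strided locations and everything in between.
    errors = []
    for i in range(len(mem)):
        expected = p1 if i % x == offset else p2
        if mem[i] != expected:
            errors.append((i, expected, mem[i]))
    return errors
```

Because only every X-th word holds P1, a stale value lingering in a cache or buffer cannot accidentally match what is expected at the neighbouring locations.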

Test 9 [Bit fade test, 2 patterns]
The bit fade test initializes all of memory with a pattern and then sleeps for 1 minute.
Then memory is examined to see if any memory bits have changed.
This is a long-duration test that checks whether bit fade occurs.

Test 10 [Memory stress]
A random pattern is generated and a large kernel is launched to set all memory to the pattern.
A new read and write kernel is launched immediately after the previous write kernel to check whether there are any errors in memory and to set the memory to the complement.
This process is repeated 1000 times for one pattern.
The kernel is written so as to achieve the maximum bandwidth between the global memory and the GPU.
This is a stress test.