During an evaluation of the memory performance of a POWER8 processor using perf, I ended up with the problem of understanding the difference between the events pm_data_all_* and pm_data_*. Many of the counters exist in both versions, and the description in the oprofile documentation and in papi_native_avail is the same for both, for example:
pm_data_from_lmem
The processor's data cache was reloaded from the local chip's memory due to either demand loads or demand loads plus prefetches if MMCR1[16] is 1.
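I thought I could figure out the difference by measuring some data. If I provide a task that is large enough, I can observe the expected difference: the *_all versions have higher values. I understand the concept of multiplexing the counters when measuring with perf, so the scaled counts are estimates rather than exact values.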
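A minimal sketch of how such a comparison could be scripted, assuming `perf stat` is available and that the event names are accepted on the machine as written. The helper names `parse_perf_csv` and `measure` are my own, not part of perf. The parser relies on the CSV layout of `perf stat -x,` (count, unit, event name, counter run time, percentage of time the counter was running), so a percentage below 100 indicates that the counter was multiplexed.

```python
import subprocess
import tempfile


def parse_perf_csv(text):
    """Parse `perf stat -x,` output into {event: (count, running_pct)}.

    count is None when perf reports e.g. "<not counted>".
    """
    results = {}
    for line in text.splitlines():
        if line.startswith("#"):
            continue  # header comment written by `perf stat -o`
        fields = line.strip().split(",")
        if len(fields) < 5:
            continue  # blank or unrelated line
        raw, _unit, event, _runtime, pct = fields[:5]
        try:
            count = float(raw)
        except ValueError:
            count = None
        try:
            running = float(pct)
        except ValueError:
            running = None
        results[event] = (count, running)
    return results


def measure(command, events):
    """Run `command` (a list of arguments) under perf stat for `events`."""
    with tempfile.NamedTemporaryFile(mode="r", suffix=".csv") as out:
        subprocess.run(
            ["perf", "stat", "-x,", "-o", out.name,
             "-e", ",".join(events), "--"] + list(command),
            check=True,
        )
        return parse_perf_csv(out.read())
```

For example, `measure(["./task"], ["pm_data_from_lmem", "pm_data_all_from_lmem"])` would return both counts for a hypothetical `./task` binary, together with the share of time each counter was actually running.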
So what is the difference between these events?
After a few more hours of searching, I found a source directly from IBM describing the events as:
pm_data_all_from_lmem
The processor's data cache was reloaded from the local chip's memory due to either demand loads or data prefetch
and
pm_data_from_lmem
The processor's data cache was reloaded from the local chip's memory due to a demand load
So the difference is the prefetch load, which is not included in the second version.
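If the IBM descriptions hold, the reloads triggered by prefetch can be estimated by subtracting the demand-only count from the *_all count. The snippet below is only a sketch of that arithmetic: the function name `estimate_prefetch` is my own, and it assumes both counts come from the same run. With multiplexed counters the values are scaled estimates, so a small negative difference is clamped to zero.

```python
def estimate_prefetch(all_count, demand_count):
    """Return (prefetch_reloads, prefetch_share) from the two counters.

    all_count    -- value of a pm_data_all_* event (demand + prefetch)
    demand_count -- value of the matching pm_data_* event (demand only)
    """
    # Multiplexing noise can make the demand count slightly larger.
    prefetch = max(0, all_count - demand_count)
    share = prefetch / all_count if all_count > 0 else 0.0
    return prefetch, share
```

With hypothetical counts of 1500 for pm_data_all_from_lmem and 1000 for pm_data_from_lmem, this attributes 500 reloads, one third of the total, to prefetch.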
The PAPI and perf tools include the wrong description. These events were contributed directly to oprofile by IBM, with these mistakes/inaccuracies. If you browse through the PAPI/libpfm source, you can see the correct description in the .pme_short_desc field, but the .pme_long_desc fields are the same for both events, and papi_native_avail reports the long one.
Thanks for your patience. Summing this stuff up helped me a lot, and I hope it helps others struggling with similar issues.