During an evaluation of the memory performance of a POWER8 processor using perf, I ended up with the problem of understanding the difference between the events pm_data_all_* and pm_data_*. Many of the counters exist in both versions, and the description in the oprofile documentation and in papi_native_avail is the same, for example:
pm_data_from_lmem
The processor's data cache was reloaded from local chip's memory due to either demand loads or demand loads plus prefetches if MMCR1[16] is 1.
I thought I could figure out the difference by measuring data. If I provide a task that is large enough, I can observe the expected difference: the *_all versions have higher values. I also understand the concept of multiplexing counters when measuring with perf.
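The measurement described above can be sketched in Python. This is a minimal sketch, not the script I actually used: it assumes that perf accepts the two event names as written here, that `perf stat -x,` prints one CSV line per event to stderr with the fields value, unit, event name, run time and percentage of time the counter was running, and that the workload itself prints nothing CSV-like to stderr. The function names are made up for illustration.

```python
import subprocess

# Event names as used in this post; whether perf accepts them under exactly
# these names on a given kernel is an assumption.
EVENTS = ("pm_data_from_lmem", "pm_data_all_from_lmem")


def parse_perf_csv(text):
    """Parse `perf stat -x,` output into {event: (count, percent_running)}."""
    counts = {}
    for line in text.splitlines():
        fields = line.strip().split(",")
        if line.startswith("#") or len(fields) < 5:
            continue
        value, _unit, event, _runtime, percent = fields[:5]
        try:
            counts[event] = (int(value), float(percent))
        except ValueError:
            # Skips "<not counted>", "<not supported>" and unrelated lines.
            continue
    return counts


def run_perf_stat(command):
    """Run `command` (a list of strings) under perf stat and parse the counts."""
    result = subprocess.run(
        ["perf", "stat", "-x,", "-e", ",".join(EVENTS), "--", *command],
        stderr=subprocess.PIPE,  # perf stat writes its report to stderr
        text=True,
        check=True,
    )
    return parse_perf_csv(result.stderr)
```

A percentage below 100 for an event means perf multiplexed the counter and scaled the reported value, so the two counts are then estimates rather than exact numbers.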
So what exactly is the difference between these events?
After a few more hours of searching, I found a source directly from IBM describing the events as:
pm_data_all_from_lmem
The processor's data cache was reloaded from local chip's memory due to either demand loads or data prefetch
and
pm_data_from_lmem
The processor's data cache was reloaded from local chip's memory due to a demand load
So the difference is the prefetch load, which is not included in the second version.
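If the IBM descriptions hold, the reloads caused by prefetching can be estimated by subtracting the demand-only count from the *_all count. The following is a small sketch of that arithmetic; the function name is hypothetical, and it assumes both counts come from the same run of the same task.

```python
def prefetch_share(demand_count, all_count):
    """Estimate prefetch-driven reloads from the two counter values.

    demand_count: value of pm_data_from_lmem (demand loads only)
    all_count:    value of pm_data_all_from_lmem (demand loads or data prefetch)
    Returns (prefetch_reloads, fraction_of_all_reloads).
    """
    if all_count < demand_count:
        # Can happen with multiplexed (scaled) counters or separate runs.
        raise ValueError("the *_all count should not be below the demand count")
    prefetch = all_count - demand_count
    fraction = prefetch / all_count if all_count else 0.0
    return prefetch, fraction
```

With multiplexed counters the result is only an estimate, since both inputs are scaled values.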
The PAPI and perf tools include the wrong description. These events were contributed directly to oprofile by IBM, mistakes/inaccuracies included. If you browse through the PAPI/libpfm source, you can see the correct description in the .pme_short_desc field, while the .pme_long_desc fields are both the same. Unfortunately, papi_native_avail reports the long one.
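To find events where the two description fields disagree, the event table can be scanned mechanically. The sketch below is an illustration under assumptions: it expects C initializers in which each entry has a `.pme_name` field followed by `.pme_short_desc` and `.pme_long_desc` string fields in that order (only the two description fields are confirmed by what I saw; `.pme_name` and the ordering are assumptions), and it does not handle escaped quotes or entries with a missing field.

```python
import re

# Matches one table entry: name, then short description, then long description.
ENTRY_RE = re.compile(
    r'\.pme_name\s*=\s*"(?P<name>[^"]+)".*?'
    r'\.pme_short_desc\s*=\s*"(?P<short>[^"]*)".*?'
    r'\.pme_long_desc\s*=\s*"(?P<long>[^"]*)"',
    re.DOTALL,
)


def differing_descriptions(source_text):
    """Return (name, short_desc, long_desc) for entries whose texts differ."""
    differing = []
    for match in ENTRY_RE.finditer(source_text):
        short = match["short"].strip()
        long_ = match["long"].strip()
        if short != long_:
            differing.append((match["name"], short, long_))
    return differing
```

Calling `differing_descriptions()` on the text of the POWER8 event header from a libpfm checkout should list the events, such as the pm_data_* pairs, where the short description says something different from the long one that papi_native_avail prints.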
Thanks for your patience. Summing this stuff up helped me a lot, and I hope it helps others struggling with similar issues.