topdown.txt - OpenGrok cross reference for /Linux-v6.1/tools/perf/Documentation/topdown.txt

2 -----------------------------------
5 methodology to break down CPU pipeline execution into 4 bottlenecks:
10 Traditionally this was implemented by events in generic counters
13 perf stat --topdown implements this.
15 Full Top Down includes more levels that can break down the
24 fixed counters and do not require generic counters. This allows
27 % perf stat -a --topdown -I1000
64 metric event, and allow user programs to read the performance counters.
95 int slots_fd = perf_event_open(&slots, 0, -1, -1, 0);
115 int metrics_fd = perf_event_open(&metrics, 0, -1, slots_fd, 0);
134 #define RDPMC_FIXED	(1 << 30)	/* return fixed counters */
135 #define RDPMC_METRIC	(1 << 29)	/* return metric counters */
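The two flags above select which class of counter an RDPMC read returns. As a minimal sketch, assuming the SLOTS fixed counter has index 3 and the TopDown metric counter has index 0 (neither index appears in the lines shown here), the selector values could be composed as follows. `rdpmc_selector` is a hypothetical helper introduced only for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define RDPMC_FIXED	(1 << 30)	/* return fixed counters */
#define RDPMC_METRIC	(1 << 29)	/* return metric counters */

/* Assumed counter indexes, not shown in the lines above. */
#define FIXED_COUNTER_SLOTS		3
#define METRIC_COUNTER_TOPDOWN_L1_L2	0

/* Hypothetical helper: combine a counter-class flag with a counter index. */
static uint32_t rdpmc_selector(uint32_t class_flag, uint32_t index)
{
	/* The index must not spill into the class flag bits. */
	assert(index < (uint32_t)RDPMC_METRIC);
	return class_flag | index;
}
```

The resulting value is what would be passed to `_rdpmc`, for example `rdpmc_selector(RDPMC_FIXED, FIXED_COUNTER_SLOTS)` for the slots counter.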
158 _rdpmc calls should not be mixed with reading the metrics and slots counters
159 through system calls, as the kernel will reset these counters after each system
216 	retiring_slots = GET_METRIC(metric_b, 0) * slots_b - retiring_slots_a
217 	bad_spec_slots = GET_METRIC(metric_b, 1) * slots_b - bad_spec_slots_a
218 	fe_bound_slots = GET_METRIC(metric_b, 2) * slots_b - fe_bound_slots_a
219 	be_bound_slots = GET_METRIC(metric_b, 3) * slots_b - be_bound_slots_a
224 	slots_delta = slots_b - slots_a
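The delta computation above can be sketched as a small C function. This is an illustration under stated assumptions, not the kernel's code: it assumes `GET_METRIC` extracts the 8-bit field at index `i` and that each field is a fraction of 0xff, so the result is divided by 0xff to obtain a ratio. `struct topdown_sample` and `topdown_delta_ratio` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: metric i occupies bits [8*i+7 : 8*i] as a fraction of 0xff. */
#define GET_METRIC(m, i) (((m) >> ((i) * 8)) & 0xff)

/* One reading of the slots counter and the packed metrics register. */
struct topdown_sample {
	uint64_t slots;
	uint64_t metrics;
};

/* Fraction of the slots between samples a and b attributed to metric idx. */
static double topdown_delta_ratio(struct topdown_sample a,
				  struct topdown_sample b, int idx)
{
	/* b must be taken after a, with no counter reset in between. */
	assert(b.slots > a.slots);

	/* Scale each 8-bit fraction by the number of slots it covers. */
	double scaled_a = (double)GET_METRIC(a.metrics, idx) * (double)a.slots;
	double scaled_b = (double)GET_METRIC(b.metrics, idx) * (double)b.slots;
	double slots_delta = (double)(b.slots - a.slots);

	/* Dividing by 0xff converts the fixed-point value into a ratio. */
	return (scaled_b - scaled_a) / (0xff * slots_delta);
}
```

Calling `topdown_delta_ratio(a, b, 0)` would give the retiring ratio for the interval; indexes 1 to 3 correspond to bad speculation, frontend bound and backend bound, following the order of the lines above.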
237 recreated from L1 and L2 metric counters. (Available on Sapphire Rapids and
247 	heavy_ops_slots = GET_METRIC(metric_b, 4) * slots_b - heavy_ops_slots_a
248 	br_mispredict_slots = GET_METRIC(metric_b, 5) * slots_b - br_mispredict_slots_a
249 	fetch_lat_slots = GET_METRIC(metric_b, 6) * slots_b - fetch_lat_slots_a
250 	mem_bound_slots = GET_METRIC(metric_b, 7) * slots_b - mem_bound_slots_a
252 	slots_delta = slots_b - slots_a
254 	light_ops_ratio = retiring_ratio - heavy_ops_ratio;
257 	machine_clears_ratio = bad_spec_ratio - br_mispredict_ratio;
260 	fetch_bw_ratio = fe_bound_ratio - fetch_lat_ratio;
263 	core_bound_ratio = be_bound_ratio - mem_bound_ratio;
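The four subtractions above derive the remaining level 2 ratios from the level 1 ratios and the four measured level 2 ratios. A minimal sketch, with hypothetical struct and function names, might look like this. Clamping small negative results to zero is an assumption added here to absorb rounding in the 8-bit fields; it is not something the source prescribes.

```c
#include <assert.h>

struct topdown_l1 { double retiring, bad_spec, fe_bound, be_bound; };
struct topdown_l2_measured { double heavy_ops, br_mispredict, fetch_lat, mem_bound; };
struct topdown_l2_derived { double light_ops, machine_clears, fetch_bw, core_bound; };

/* Assumption: rounding may push a difference slightly below zero. */
static double clamp_nonneg(double v)
{
	return v < 0.0 ? 0.0 : v;
}

/* Each derived metric is its level 1 parent minus the measured sibling. */
static struct topdown_l2_derived derive_l2(struct topdown_l1 l1,
					   struct topdown_l2_measured l2)
{
	struct topdown_l2_derived d;

	/* Ratios are expected to be non-negative fractions of all slots. */
	assert(l1.retiring >= 0.0 && l1.bad_spec >= 0.0);
	assert(l1.fe_bound >= 0.0 && l1.be_bound >= 0.0);

	d.light_ops = clamp_nonneg(l1.retiring - l2.heavy_ops);
	d.machine_clears = clamp_nonneg(l1.bad_spec - l2.br_mispredict);
	d.fetch_bw = clamp_nonneg(l1.fe_bound - l2.fetch_lat);
	d.core_bound = clamp_nonneg(l1.be_bound - l2.mem_bound);
	return d;
}
```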
278 Resetting metrics counters
283 fraction bit shrinks. So the counters need to be reset regularly.
289 When using perf stat it is recommended to always use the -I option,
292 	perf stat -I 1000 --topdown ...
307 Four pseudo TopDown metric events are exposed for the end-users,
308 topdown-retiring, topdown-bad-spec, topdown-fe-bound and topdown-be-bound.
311 - All the TopDown metric events must be in a group with the SLOTS event.
312 - The SLOTS event must be the leader of the group.
313 - The PERF_FORMAT_GROUP flag must be applied for each TopDown metric
319 For example, perf record -e '{slots, $sampling_event, topdown-retiring}:S'
325 The upper half is also divided into four 8-bit fields for the new level 2
326 metrics. Four more TopDown metric events are exposed for the end-users,
327 topdown-heavy-ops, topdown-br-mispredict, topdown-fetch-lat and
328 topdown-mem-bound.
334     Light_Operations = Retiring - Heavy_Operations
335     Machine_Clears = Bad_Speculation - Branch_Mispredicts
336     Fetch_Bandwidth = Frontend_Bound - Fetch_Latency
337     Core_Bound = Backend_Bound - Memory_Bound
340 [1] https://software.intel.com/en-us/top-down-microarchitecture-analysis-method-win
341 [2] https://github.com/andikleen/pmu-tools/wiki/toplev-manual
342 [3] https://software.intel.com/en-us/intel-vtune-amplifier-xe
343 [4] https://github.com/andikleen/pmu-tools/tree/master/jevents
344 [5] https://sites.google.com/site/analysismethods/yasin-pubs