1 perf-stat(1)
5 ----
6 perf-stat - Run a command and gather performance counter statistics
9 --------
11 'perf stat' [-e <EVENT> | --event=EVENT] [-a] <command>
12 'perf stat' [-e <EVENT> | --event=EVENT] [-a] \-- <command> [<options>]
13 'perf stat' [-e <EVENT> | --event=EVENT] [-a] record [-o file] \-- <command> [<options>]
14 'perf stat' report [-i file]
17 -----------
23 -------
33 -e::
34 --event=::
37 - a symbolic event name (use 'perf list' to list all events)
39 - a raw PMU event in the form of rN where N is a hexadecimal value
44 - a symbolic or raw PMU event followed by an optional colon
45 and a list of event modifiers, e.g., cpu-cycles:p. See the
46 linkperf:perf-list[1] man page for details on event modifiers.
48 - a symbolically formed event like 'pmu/param1=0x3,param2/' where
54 perf stat -A -a -e cpu/event,percore=1/,otherevent ...
56 - a symbolically formed event like 'pmu/config=M,config1=N,config2=K/'
69 -i::
70 --no-inherit::
71 child tasks do not inherit counters
72 -p::
73 --pid=<pid>::
76 -t::
77 --tid=<tid>::
80 -b::
81 --bpf-prog::
83 requiring root rights. bpftool-prog could be used to find program
86 # bpftool prog | head -n 1
89 # perf stat -e cycles,instructions --bpf-prog 17247 --timeout 1000
98 --bpf-counters::
100 allows multiple perf-stat sessions that are counting the same metric (cycles,
101 instructions, etc.) to share hardware counters.
103 "perf config stat.bpf-counter-events=<list_of_events>".
105 --bpf-attr-map::
106 With option "--bpf-counters", different perf-stat sessions share
108 Use "--bpf-attr-map" to specify the path of this pinned hashmap.
112 --pfm-events events::
114 including support for event filters. For example '--pfm-events
117 events cannot be mixed together. The latter must be used with the -e
118 option. The -e option and this one can be mixed and matched. Events
122 -a::
123 --all-cpus::
124 system-wide collection from all CPUs (default if no target is specified)
126 --no-scale::
129 -d::
130 --detailed::
133 -d: detailed events, L1 and LLC data cache
134 -d -d: more detailed events, dTLB and iTLB events
135 -d -d -d: very detailed events, adding prefetch events
137 -r::
138 --repeat=<n>::
141 -B::
142 --big-num::
144 Enabled by default. Use "--no-big-num" to disable.
145 Default setting can be changed with "perf config stat.big-num=false".
147 -C::
148 --cpu=::
150 comma-separated list with no space: 0,1. Ranges of CPUs are specified with -: 0-2.
151 In per-thread mode, this option is ignored. The -a option is still necessary
152 to activate system-wide monitoring. Default is to count on all CPUs.
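
The CPU list syntax described for -C (comma-separated entries with no spaces, ranges written with a dash) can be expanded with a few lines of Python. This is only an illustrative sketch: the helper name `parse_cpu_list` is hypothetical and is not part of perf, and perf's own parser may accept or reject inputs differently.

```python
def parse_cpu_list(spec):
    """Expand a CPU list such as "0,1" or "0-2" into sorted CPU numbers."""
    if any(ch.isspace() for ch in spec):
        # The option text asks for a list with no spaces.
        raise ValueError("CPU list must not contain whitespace")
    cpus = set()
    for part in spec.split(","):
        if "-" in part:
            first, last = part.split("-", 1)
            low, high = int(first), int(last)
            if low > high:
                raise ValueError("descending range: %r" % part)
            cpus.update(range(low, high + 1))
        else:
            cpus.add(int(part))  # int() raises ValueError on an empty entry
    return sorted(cpus)


print(parse_cpu_list("0,1"))  # [0, 1]
print(parse_cpu_list("0-2"))  # [0, 1, 2]
```

A helper like this can be handy for checking which CPUs a -C argument covers before starting a long system-wide run.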
154 -A::
155 --no-aggr::
158 -n::
159 --null::
160 null run - Don't start any counters.
162 This can be useful to measure just elapsed wall-clock time - or to assess the
163 raw overhead of perf stat itself, without running any counters.
165 -v::
166 --verbose::
169 -x SEP::
170 --field-separator SEP::
171 print counts using a CSV-style output to make it easy to import directly into
174 --table:: Display time for each run (-r option), in a table format, e.g.:
176 $ perf stat --null -r 5 --table perf bench sched pipe
181 5.189 (-0.293) #
182 5.189 (-0.294) #
183 5.186 (-0.296) #
188 5.483 +- 0.198 seconds time elapsed ( +- 3.62% )
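
The summary line of a repeated run reports a mean, a "+-" spread, and the same spread as a percentage of the mean. The sketch below recomputes such a summary from a list of per-run times. It assumes the spread is the standard error of the mean (sample standard deviation divided by the square root of the run count); the text above does not state this, so treat it as an assumption. `summarize_runs` is a hypothetical helper, not a perf API.

```python
import math


def summarize_runs(times):
    """Return (mean, spread, spread_percent) for a list of run times."""
    n = len(times)
    if n < 2:
        raise ValueError("need at least two runs to estimate a spread")
    mean = sum(times) / n
    # Sample variance (n - 1 in the denominator).
    variance = sum((t - mean) ** 2 for t in times) / (n - 1)
    # Assumption: the reported spread is the standard error of the mean.
    spread = math.sqrt(variance / n)
    return mean, spread, 100.0 * spread / mean


mean, spread, percent = summarize_runs([1.0, 2.0, 3.0])
print("%.3f +- %.3f seconds ( +- %.2f%% )" % (mean, spread, percent))
```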
190 -G name::
191 --cgroup name::
193 in per-cpu mode. The cgroup filesystem must be mounted. All threads belonging to
197 an empty cgroup (monitor all the time) using, e.g., -G foo,,bar. Cgroups must have
200 use '-e e1 -e e2 -G foo,foo' or just use '-e e1 -e e2 -G foo'.
203 command line can be used: 'perf stat -e cycles -G cgroup_name -a -e cycles'.
205 --for-each-cgroup name::
208 effect that repeating -e option and -G option for each event x name. This option
209 cannot be used with -G/--cgroup option.
211 -o file::
212 --output file::
215 --append::
216 Append to the output file designated with the -o option. Ignored if -o is not specified.
218 --log-fd::
220 Log output to fd, instead of stderr. Complementary to --output, and mutually exclusive
221 with it. --append may be used here. Examples:
222 3>results perf stat --log-fd 3 \-- $cmd
223 3>>results perf stat --log-fd 3 --append \-- $cmd
225 --control=fifo:ctl-fifo[,ack-fifo]::
226 --control=fd:ctl-fd[,ack-fd]::
227 ctl-fifo / ack-fifo are opened and used as ctl-fd / ack-fd as follows.
228 Listen on ctl-fd descriptor for command to control measurement ('enable': enable events,
230 --delay=-1 option. Optionally send control command completion ('ack\n') to ack-fd descriptor
239 test -p ${ctl_fifo} && unlink ${ctl_fifo}
244 test -p ${ctl_ack_fifo} && unlink ${ctl_ack_fifo}
248 perf stat -D -1 -e cpu-cycles -a -I 1000 \
249 --control fd:${ctl_fd},${ctl_fd_ack} \
250 \-- sleep 30 &
253 sleep 5 && echo 'enable' >&${ctl_fd} && read -u ${ctl_fd_ack} e1 && echo "enabled(${e1})"
254 sleep 10 && echo 'disable' >&${ctl_fd} && read -u ${ctl_fd_ack} d1 && echo "disabled(${d1})"
256 exec {ctl_fd_ack}>&-
259 exec {ctl_fd}>&-
262 wait -n ${perf_pid}
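
The same command/acknowledgement handshake can be driven from Python instead of the shell. The sketch below only covers the exchange on two already-open descriptors: it writes a command such as 'enable' or 'disable' followed by a newline (as `echo` does in the script above) and waits for a reply line on the ack descriptor. The function name `send_control` is hypothetical, and opening the FIFOs and starting perf stat are left out.

```python
import os


def send_control(ctl_fd, ack_fd, command):
    """Write one control command, then block until an ack line arrives."""
    os.write(ctl_fd, (command + "\n").encode())
    reply = b""
    while b"\n" not in reply:
        chunk = os.read(ack_fd, 64)
        if not chunk:
            raise EOFError("ack descriptor closed before a reply arrived")
        reply += chunk
    # Keep the text before the first newline; any trailing bytes are dropped.
    return reply.split(b"\n", 1)[0].decode()
```

With descriptors connected to a running `perf stat --control fd:...` session, `send_control(ctl_fd, ack_fd, "enable")` would be expected to return `"ack"`.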
266 --pre::
267 --post::
270 perf stat --repeat 10 --null --sync --pre 'make -s O=defconfig-build/clean' \-- make -s -j64 O=defc…
272 -I msecs::
273 --interval-print msecs::
276 example: 'perf stat -I 1000 -e cycles -a sleep 5'
280 --interval-count times::
282 This option should be used together with the "-I" option.
283 example: 'perf stat -I 1000 --interval-count 2 -e cycles -a'
285 --interval-clear::
288 --timeout msecs::
290 This option is not supported with the "-I" option.
291 example: 'perf stat --timeout 2000 -e cycles -a'
293 --metric-only::
295 Don't show any raw values. Not supported with --per-thread.
297 --per-socket::
298 Aggregate counts per processor socket for system-wide mode measurements. This
300 use --per-socket in addition to -a (system-wide). The output includes the
304 --per-die::
305 Aggregate counts per processor die for system-wide mode measurements. This
307 use --per-die in addition to -a (system-wide). The output includes the
311 --per-core::
312 Aggregate counts per physical processor for system-wide mode measurements. This
314 use --per-core in addition to -a (system-wide). The output includes the
317 --per-thread::
318 Aggregate counts per monitored thread, when monitoring threads (-t option)
319 or processes (-p option).
321 --per-node::
322 Aggregate counts per NUMA node for system-wide mode measurements. This
324 mode, use --per-node in addition to -a (system-wide).
326 -D msecs::
327 --delay msecs::
328 After starting the program, wait msecs before measuring (-1: start with events
332 -T::
333 --transaction::
337 --metric-no-group::
340 --metric-no-group option places events outside of groups and may
341 increase the chance of the event being scheduled - leading to more
343 for metrics like instructions per cycle can be lower - as both metrics
346 --metric-no-merge::
356 --quiet::
361 -----------
364 -o file::
365 --output file::
369 -----------
372 -i file::
373 --input file::
376 --per-socket::
377 Aggregate counts per processor socket for system-wide mode measurements.
379 --per-die::
380 Aggregate counts per processor die for system-wide mode measurements.
382 --per-core::
383 Aggregate counts per physical processor for system-wide mode measurements.
385 -M::
386 --metrics::
392 -A::
393 --no-aggr::
396 --topdown::
397 Print complete top-down metrics supported by the CPU. This makes it possible to
399 by breaking the cycles consumed down into frontend bound, backend bound,
410 mode like -I 1000, as the bottleneck of workloads can change often.
412 This enables --metric-only, unless overridden with --no-metric-only.
417 The top-down metrics are collected per core instead of per
419 and -a (global monitoring) is needed, requiring root rights or
420 kernel.perf_event_paranoid=-1.
432 --td-level::
433 Print the top-down statistics that are equal to or lower than the input level.
434 It allows users to print only the top-down metrics level of interest instead of
435 the complete top-down metrics.
437 The availability of the top-down metrics level depends on the hardware. For
438 example, Ice Lake only supports L1 top-down metrics. Sapphire Rapids
439 supports both L1 and L2 top-down metrics.
444 --no-merge::
457 --hybrid-merge::
465 For non-hybrid events, it should have no effect.
467 --smi-cost::
471 freeze core counters on SMI.
473 The cost of SMI can be measured by (aperf - unhalted core cycles).
476 oriented analysis. --metric-only will be applied by default.
477 The output is SMI cycles%, which equals (aperf - unhalted core cycles) / aperf
479 Users who want to get the actual value can apply --no-metric-only.
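
The SMI cycles% figure follows directly from the two counter values named above. A minimal sketch of that arithmetic, assuming both counts were read over the same interval; the function name is hypothetical and the sample numbers are made up:

```python
def smi_cycles_percent(aperf, unhalted_core_cycles):
    """Return (aperf - unhalted core cycles) / aperf as a percentage."""
    if aperf <= 0:
        raise ValueError("aperf must be positive")
    return 100.0 * (aperf - unhalted_core_cycles) / aperf


# Made-up counts: 1000 aperf cycles, 900 unhalted core cycles.
print(smi_cycles_percent(1000, 900))  # 10.0
```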
481 --all-kernel::
484 --all-user::
487 --percore-show-thread::
496 --summary::
497 Print summary for interval mode (-I).
499 --no-csv-summary::
501 This option must be used with -x and --summary.
504 'stat.no-csv-summary'.
506 $ perf config stat.no-csv-summary=true
508 --cputype::
513 --------
515 $ perf stat \-- make
519 83723.452481 task-clock:u (msec) # 1.004 CPUs utilized
520 0 context-switches:u # 0.000 K/sec
521 0 cpu-migrations:u # 0.000 K/sec
522 3,228,188 page-faults:u # 0.039 M/sec
526 2,078,861,393 branch-misses:u # 2.98% of all branches
534 -------
536 We always display the time the counters were enabled/alive:
549 ----------
551 With -x, perf stat is able to produce not-quite-CSV output
553 it is recommended to use a different character like -x \;
557 - optional usec time stamp in fractions of a second (with -I xxx)
558 - optional CPU, core, or socket identifier
559 - optional number of logical CPUs aggregated
560 - counter value
561 - unit of the counter value or empty
562 - event name
563 - run time of counter
564 - percentage of measurement time the counter was running
565 - optional variance if multiple values are collected with -r
566 - optional metric value
567 - optional unit of metric
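
The field order above maps naturally onto a small parser. The sketch below handles only the simplest layout, that is, output produced without -I, without per-CPU or per-core aggregation, and without -r, so none of the optional leading fields or the variance column are present. The helper name `parse_stat_csv_line` and the sample line are hypothetical; the sample was constructed to follow the listed order, not captured from a real run.

```python
def parse_stat_csv_line(line, sep=";"):
    """Split one 'perf stat -x SEP' line into named fields.

    Assumes no timestamp, no CPU/core/socket identifier, no
    aggregated-CPU count, and no variance column.
    """
    fields = line.rstrip("\n").split(sep)
    if len(fields) < 5:
        raise ValueError("expected at least 5 fields, got %d" % len(fields))

    def to_float(text):
        # Empty fields and non-numeric placeholders become None.
        try:
            return float(text)
        except ValueError:
            return None

    return {
        "value": to_float(fields[0]),
        "unit": fields[1],
        "event": fields[2],
        "runtime": to_float(fields[3]),
        "percent_running": to_float(fields[4]),
        "metric_value": to_float(fields[5]) if len(fields) > 5 else None,
        "metric_unit": fields[6] if len(fields) > 6 else "",
    }


# Hypothetical line built to match the field order listed above.
sample = "83723.45;msec;task-clock:u;83723452481;100.00;1.004;CPUs utilized"
record = parse_stat_csv_line(sample)
print(record["event"], record["value"], record["metric_unit"])
```

Because the optional fields shift every later column, a parser for interval or aggregated output would need to know which options were passed to perf stat.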
571 include::intel-hybrid.txt[]
574 -----------
576 With -j, perf stat is able to print out JSON-format output
579 - timestamp : optional usec time stamp in fractions of a second (with -I)
580 - optional aggregate options:
581 - core : core identifier (with --per-core)
582 - die : die identifier (with --per-die)
583 - socket : socket identifier (with --per-socket)
584 - node : node identifier (with --per-node)
585 - thread : thread identifier (with --per-thread)
586 - counter-value : counter value
587 - unit : unit of the counter value or empty
588 - event : event name
589 - variance : optional variance if multiple values are collected (with -r)
590 - runtime : run time of counter
591 - metric-value : optional metric value
592 - metric-unit : optional unit of metric
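
JSON output is easier to consume than the CSV form because each field is named. The sketch below assumes that each record arrives as one JSON object per line and that the counter value may be encoded as a string; both are assumptions about the output layout, and the sample record is hypothetical. Only the keys `counter-value`, `unit`, and `event` from the list above are used.

```python
import json


def parse_stat_json(text):
    """Collect the JSON objects found in 'perf stat -j' output, one per line."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("{"):
            continue  # skip blank lines and anything that is not an object
        obj = json.loads(line)
        try:
            obj["counter-value"] = float(obj.get("counter-value"))
        except (TypeError, ValueError):
            obj["counter-value"] = None  # missing or non-numeric value
        records.append(obj)
    return records


# Hypothetical record using the field names listed above.
sample = '{"counter-value" : "1234.000000", "unit" : "", "event" : "cycles"}\n'
for rec in parse_stat_json(sample):
    print(rec["event"], rec["counter-value"])
```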
595 --------
596 linkperf:perf-top[1], linkperf:perf-list[1]