perf-stat.txt - OpenGrok cross reference for /Linux-v5.15/tools/perf/Documentation/perf-stat.txt

1 perf-stat(1)
5 ----
6 perf-stat - Run a command and gather performance counter statistics
9 --------
11 'perf stat' [-e <EVENT> | --event=EVENT] [-a] <command>
12 'perf stat' [-e <EVENT> | --event=EVENT] [-a] \-- <command> [<options>]
13 'perf stat' [-e <EVENT> | --event=EVENT] [-a] record [-o file] \-- <command> [<options>]
14 'perf stat' report [-i file]
17 -----------
23 -------
33 -e::
34 --event=::
37 	- a symbolic event name (use 'perf list' to list all events)
39 	- a raw PMU event (eventsel+umask) in the form of rNNN where NNN is a
42         - a symbolic or raw PMU event followed by an optional colon
43 	  and a list of event modifiers, e.g., cpu-cycles:p.  See the
44 	  linkperf:perf-list[1] man page for details on event modifiers.
46 	- a symbolically formed event like 'pmu/param1=0x3,param2/' where
52 	  perf stat -A -a -e cpu/event,percore=1/,otherevent ...
54 	- a symbolically formed event like 'pmu/config=M,config1=N,config2=K/'
67 -i::
68 --no-inherit::
70 -p::
71 --pid=<pid>::
74 -t::
75 --tid=<tid>::
78 -b::
79 --bpf-prog::
81         requiring root rights. bpftool-prog could be used to find program
84   # bpftool prog | head -n 1
87   # perf stat -e cycles,instructions --bpf-prog 17247 --timeout 1000
96 --bpf-counters::
98 	allows multiple perf-stat sessions that are counting the same metric (cycles,
101 	"perf config stat.bpf-counter-events=<list_of_events>".
103 --bpf-attr-map::
104 	With option "--bpf-counters", different perf-stat sessions share
106 	Use "--bpf-attr-map" to specify the path of this pinned hashmap.
110 --pfm-events events::
112 including support for event filters. For example '--pfm-events
115 events cannot be mixed together. The latter must be used with the -e
116 option. The -e option and this one can be mixed and matched.  Events
120 -a::
121 --all-cpus::
122         system-wide collection from all CPUs (default if no target is specified)
124 --no-scale::
127 -d::
128 --detailed::
131 	   -d:          detailed events, L1 and LLC data cache
132         -d -d:     more detailed events, dTLB and iTLB events
133      -d -d -d:     very detailed events, adding prefetch events
135 -r::
136 --repeat=<n>::
139 -B::
140 --big-num::
142 	Enabled by default. Use "--no-big-num" to disable.
143 	Default setting can be changed with "perf config stat.big-num=false".
145 -C::
146 --cpu=::
148 comma-separated list with no space: 0,1. Ranges of CPUs are specified with -: 0-2.
149 In per-thread mode, this option is ignored. The -a option is still necessary
150 to activate system-wide monitoring. Default is to count on all CPUs.
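The two CPU list forms described above (comma-separated IDs and `-` ranges) can be illustrated with a small shell helper. This is only a sketch for checking which CPUs a list names: `expand_cpu_list` is a hypothetical function, not part of perf, and it assumes the `seq` utility is available.

```shell
# expand_cpu_list LIST
# Hypothetical helper (not a perf command): expands a -C style list such as
# "0,1" or "0-2" into the individual CPU numbers it names.
expand_cpu_list() {
    printf '%s\n' "$1" | tr ',' '\n' | while IFS=- read -r first last; do
        # A plain ID has no upper bound, so fall back to the ID itself.
        seq "$first" "${last:-$first}"
    done | tr '\n' ' ' | sed 's/ $//'
}

expand_cpu_list 0,1; echo
expand_cpu_list 0-2; echo
```

The same list string would then be passed unchanged to the option, e.g. `perf stat -a -C 0-2 ...`.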
152 -A::
153 --no-aggr::
156 -n::
157 --null::
158 null run - Don't start any counters.
160 This can be useful to measure just elapsed wall-clock time - or to assess the
163 -v::
164 --verbose::
167 -x SEP::
168 --field-separator SEP::
169 print counts using a CSV-style output to make it easy to import directly into
172 --table:: Display time for each run (-r option), in a table format, e.g.:
174   $ perf stat --null -r 5 --table perf bench sched pipe
179              5.189 (-0.293) #
180              5.189 (-0.294) #
181              5.186 (-0.296) #
186              5.483 +- 0.198 seconds time elapsed  ( +-  3.62% )
188 -G name::
189 --cgroup name::
191 in per-cpu mode. The cgroup filesystem must be mounted. All threads belonging to
195 an empty cgroup (monitor all the time) using, e.g., -G foo,,bar. Cgroups must have
198 use '-e e1 -e e2 -G foo,foo' or just use '-e e1 -e e2 -G foo'.
201 command line can be used: 'perf stat -e cycles -G cgroup_name -a -e cycles'.
203 --for-each-cgroup name::
206 effect that repeating -e option and -G option for each event x name.  This option
207 cannot be used with -G/--cgroup option.
209 -o file::
210 --output file::
213 --append::
214 Append to the output file designated with the -o option. Ignored if -o is not specified.
216 --log-fd::
218 Log output to fd, instead of stderr.  Complementary to --output, and mutually exclusive
219 with it.  --append may be used here.  Examples:
220      3>results  perf stat --log-fd 3          \-- $cmd
221      3>>results perf stat --log-fd 3 --append \-- $cmd
223 --control=fifo:ctl-fifo[,ack-fifo]::
224 --control=fd:ctl-fd[,ack-fd]::
225 ctl-fifo / ack-fifo are opened and used as ctl-fd / ack-fd as follows.
226 Listen on ctl-fd descriptor for command to control measurement ('enable': enable events,
228 --delay=-1 option. Optionally send control command completion ('ack\n') to ack-fd descriptor
237  test -p ${ctl_fifo} && unlink ${ctl_fifo}
242  test -p ${ctl_ack_fifo} && unlink ${ctl_ack_fifo}
246  perf stat -D -1 -e cpu-cycles -a -I 1000       \
247            --control fd:${ctl_fd},${ctl_fd_ack} \
248            \-- sleep 30 &
251  sleep 5  && echo 'enable' >&${ctl_fd} && read -u ${ctl_fd_ack} e1 && echo "enabled(${e1})"
252  sleep 10 && echo 'disable' >&${ctl_fd} && read -u ${ctl_fd_ack} d1 && echo "disabled(${d1})"
254  exec {ctl_fd_ack}>&-
257  exec {ctl_fd}>&-
260  wait -n ${perf_pid}
264 --pre::
265 --post::
268 perf stat --repeat 10 --null --sync --pre 'make -s O=defconfig-build/clean' \-- make -s -j64 O=defc…
270 -I msecs::
271 --interval-print msecs::
274 	example: 'perf stat -I 1000 -e cycles -a sleep 5'
278 --interval-count times::
280 This option should be used together with the "-I" option.
281 	example: 'perf stat -I 1000 --interval-count 2 -e cycles -a'
283 --interval-clear::
286 --timeout msecs::
288 This option is not supported with the "-I" option.
289 	example: 'perf stat --timeout 2000 -e cycles -a'
291 --metric-only::
293 Don't show any raw values. Not supported with --per-thread.
295 --per-socket::
296 Aggregate counts per processor socket for system-wide mode measurements.  This
298 use --per-socket in addition to -a (system-wide).  The output includes the
302 --per-die::
303 Aggregate counts per processor die for system-wide mode measurements.  This
305 use --per-die in addition to -a (system-wide).  The output includes the
309 --per-core::
310 Aggregate counts per physical processor for system-wide mode measurements.  This
312 use --per-core in addition to -a (system-wide).  The output includes the
315 --per-thread::
316 Aggregate counts per monitored thread, when monitoring threads (-t option)
317 or processes (-p option).
319 --per-node::
320 Aggregate counts per NUMA node for system-wide mode measurements. This
322 mode, use --per-node in addition to -a (system-wide).
324 -D msecs::
325 --delay msecs::
326 After starting the program, wait msecs before measuring (-1: start with events
330 -T::
331 --transaction::
335 --metric-no-group::
338 --metric-no-group option places events outside of groups and may
339 increase the chance of the event being scheduled - leading to more
340 accuracy. However, as events may not be scheduled together accuracy
341 for metrics like instructions per cycle can be lower - as both metrics
344 --metric-no-merge::
349 group is that the group may require multiplexing and so accuracy for a
352 may be used to increase accuracy in this case.
354 --quiet::
359 -----------
362 -o file::
363 --output file::
367 -----------
370 -i file::
371 --input file::
374 --per-socket::
375 Aggregate counts per processor socket for system-wide mode measurements.
377 --per-die::
378 Aggregate counts per processor die for system-wide mode measurements.
380 --per-core::
381 Aggregate counts per physical processor for system-wide mode measurements.
383 -M::
384 --metrics::
390 -A::
391 --no-aggr::
394 --topdown::
395 Print complete top-down metrics supported by the CPU. This allows to
408 mode like -I 1000, as the bottleneck of workloads can change often.
410 This enables --metric-only, unless overridden with --no-metric-only.
417 and -a (global monitoring) is needed, requiring root rights or
418 kernel.perf_event_paranoid=-1.
430 --td-level::
431 Print the top-down statistics that are equal to or lower than the input level.
432 It allows users to print only the top-down metrics levels of interest instead of
433 the complete top-down metrics.
435 The availability of the top-down metrics level depends on the hardware. For
436 example, Ice Lake only supports L1 top-down metrics. Sapphire Rapids
437 supports both L1 and L2 top-down metrics.
442 --no-merge::
455 --smi-cost::
461 The cost of SMI can be measured by (aperf - unhalted core cycles).
464 oriented analysis. --metric-only will be applied by default.
465 The output is SMI cycles%, equal to (aperf - unhalted core cycles) / aperf
467 Users who want to get the actual value can apply --no-metric-only.
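As a worked instance of the SMI cycles% formula above, the following sketch evaluates (aperf - unhalted core cycles) / aperf as a percentage. The function name `smi_cycles_pct` and the two counter values are made up for illustration; they are not perf output.

```shell
# smi_cycles_pct APERF UNHALTED_CORE_CYCLES
# Hypothetical helper: applies (aperf - unhalted core cycles) / aperf
# and prints the result as a percentage with one decimal place.
smi_cycles_pct() {
    awk -v aperf="$1" -v cycles="$2" \
        'BEGIN { printf "%.1f%%\n", (aperf - cycles) * 100 / aperf }'
}

# Illustrative values only: 1,000,000 aperf cycles, 990,000 unhalted core cycles.
smi_cycles_pct 1000000 990000   # prints 1.0%
```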
469 --all-kernel::
472 --all-user::
475 --percore-show-thread::
484 --summary::
485 Print summary for interval mode (-I).
487 --no-csv-summary::
489 This option must be used with -x and --summary.
492 'stat.no-csv-summary'.
494 $ perf config stat.no-csv-summary=true
497 --------
499 $ perf stat \-- make
503         83723.452481      task-clock:u (msec)       #    1.004 CPUs utilized
504                    0      context-switches:u        #    0.000 K/sec
505                    0      cpu-migrations:u          #    0.000 K/sec
506            3,228,188      page-faults:u             #    0.039 M/sec
510        2,078,861,393      branch-misses:u           #    2.98% of all branches
518 -------
533 ----------
535 With -x, perf stat is able to produce output in a not-quite-CSV format
537 it is recommended to use a different character like -x \;
541 	- optional usec time stamp in fractions of a second (with -I xxx)
542 	- optional CPU, core, or socket identifier
543 	- optional number of logical CPUs aggregated
544 	- counter value
545 	- unit of the counter value or empty
546 	- event name
547 	- run time of counter
548 	- percentage of measurement time the counter was running
549 	- optional variance if multiple values are collected with -r
550 	- optional metric value
551 	- optional unit of metric
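The field order listed above can be consumed with a simple field splitter. The sketch below assumes output produced with `-x \;` and without -I, -A or any --per-* option, so the optional time stamp, identifier and CPU-count columns are absent and the counter value is the first field. The sample line is hand-written for illustration, and `parse_stat_line` is a hypothetical helper.

```shell
# Hand-written sample in the layout described above (not captured perf output):
# value;unit;event;run time;percentage running;metric value;metric unit
sample='3228188;;page-faults:u;83723452481;100.00;0.039;M/sec'

# parse_stat_line LINE
# Hypothetical helper: splits one line on ';' and prints the event name,
# counter value and the percentage of time the counter was running.
parse_stat_line() {
    printf '%s\n' "$1" |
        awk -F';' '{ printf "event=%s value=%s running=%s%%\n", $3, $1, $5 }'
}

parse_stat_line "$sample"
```

When the optional leading columns are present, the field positions shift accordingly, so a real parser would need to know which options were used for the run.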
555 include::intel-hybrid.txt[]
558 --------
559 linkperf:perf-top[1], linkperf:perf-list[1]