1 perf-stat(1)
5 ----
6 perf-stat - Run a command and gather performance counter statistics
9 --------
11 'perf stat' [-e <EVENT> | --event=EVENT] [-a] <command>
12 'perf stat' [-e <EVENT> | --event=EVENT] [-a] \-- <command> [<options>]
13 'perf stat' [-e <EVENT> | --event=EVENT] [-a] record [-o file] \-- <command> [<options>]
14 'perf stat' report [-i file]
17 -----------
23 -------
33 -e::
34 --event=::
37 - a symbolic event name (use 'perf list' to list all events)
39 - a raw PMU event (eventsel+umask) in the form of rNNN where NNN is a
42 - a symbolic or raw PMU event followed by an optional colon
43 and a list of event modifiers, e.g., cpu-cycles:p. See the
44 linkperf:perf-list[1] man page for details on event modifiers.
46 - a symbolically formed event like 'pmu/param1=0x3,param2/' where
52 perf stat -A -a -e cpu/event,percore=1/,otherevent ...
54 - a symbolically formed event like 'pmu/config=M,config1=N,config2=K/'
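As an illustration of the raw `rNNN` form mentioned in the list above, here is a minimal sketch that composes the hex string from an event-select code and a unit mask. It assumes the common x86 layout in which the raw value is `(umask << 8) | eventsel`; other PMUs may encode events differently. The helper name `raw_event` and the numeric codes are hypothetical and chosen only for the example.

```shell
#!/bin/bash
# Build a raw event string of the form rNNN (NNN is hexadecimal).
# Assumption: x86-style encoding, raw = (umask << 8) | eventsel.
raw_event() {
    local eventsel=$1   # event-select code, e.g. 0xc4
    local umask=$2      # unit mask, e.g. 0x20
    printf 'r%x\n' $(( ($umask << 8) | $eventsel ))
}

# Hypothetical codes, for illustration only.
raw_event 0xc4 0x20
```

The resulting string could then be passed to the -e option, for example `perf stat -e "$(raw_event 0xc4 0x20)" ...`, provided the codes are valid for the CPU in use.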
67 -i::
68 --no-inherit::
70 -p::
71 --pid=<pid>::
74 -t::
75 --tid=<tid>::
78 -b::
79 --bpf-prog::
81 requiring root rights. bpftool-prog could be used to find program
84 # bpftool prog | head -n 1
87 # perf stat -e cycles,instructions --bpf-prog 17247 --timeout 1000
96 --bpf-counters::
98 allows multiple perf-stat sessions that are counting the same metric (cycles,
101 "perf config stat.bpf-counter-events=<list_of_events>".
103 --bpf-attr-map::
104 With option "--bpf-counters", different perf-stat sessions share
106 Use "--bpf-attr-map" to specify the path of this pinned hashmap.
110 --pfm-events events::
112 including support for event filters. For example '--pfm-events
115 events cannot be mixed together. The latter must be used with the -e
116 option. The -e option and this one can be mixed and matched. Events
120 -a::
121 --all-cpus::
122 system-wide collection from all CPUs (default if no target is specified)
124 --no-scale::
127 -d::
128 --detailed::
131 -d: detailed events, L1 and LLC data cache
132 -d -d: more detailed events, dTLB and iTLB events
133 -d -d -d: very detailed events, adding prefetch events
135 -r::
136 --repeat=<n>::
139 -B::
140 --big-num::
142 Enabled by default. Use "--no-big-num" to disable.
143 Default setting can be changed with "perf config stat.big-num=false".
145 -C::
146 --cpu=::
148 comma-separated list with no space: 0,1. Ranges of CPUs are specified with -: 0-2.
149 In per-thread mode, this option is ignored. The -a option is still necessary
150 to activate system-wide monitoring. Default is to count on all CPUs.
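To make the CPU list syntax concrete, the following sketch expands a list such as `0-2,5` into individual CPU numbers. It only illustrates the comma and range notation described above; the function name `expand_cpu_list` is hypothetical and is not part of perf.

```shell
#!/bin/bash
# Expand a perf-style CPU list ("0,1" or "0-2,5") into one CPU number per line.
expand_cpu_list() {
    local list=$1 part lo hi
    local IFS=,                  # split the list on commas
    for part in $list; do
        if [[ $part == *-* ]]; then
            lo=${part%-*}        # text before the dash
            hi=${part#*-}        # text after the dash
            seq "$lo" "$hi"      # inclusive range
        else
            echo "$part"
        fi
    done
}

expand_cpu_list 0-2,5
```

The list `0-2,5` therefore names CPUs 0, 1, 2 and 5.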
152 -A::
153 --no-aggr::
156 -n::
157 --null::
158 null run - Don't start any counters.
160 This can be useful to measure just elapsed wall-clock time - or to assess the
163 -v::
164 --verbose::
167 -x SEP::
168 --field-separator SEP::
169 print counts using a CSV-style output to make it easy to import directly into
172 --table:: Display time for each run (-r option), in a table format, e.g.:
174 $ perf stat --null -r 5 --table perf bench sched pipe
179 5.189 (-0.293) #
180 5.189 (-0.294) #
181 5.186 (-0.296) #
186 5.483 +- 0.198 seconds time elapsed ( +- 3.62% )
188 -G name::
189 --cgroup name::
191 in per-cpu mode. The cgroup filesystem must be mounted. All threads belonging to
195 an empty cgroup (monitor all the time) using, e.g., -G foo,,bar. Cgroups must have
198 use '-e e1 -e e2 -G foo,foo' or just use '-e e1 -e e2 -G foo'.
201 command line can be used: 'perf stat -e cycles -G cgroup_name -a -e cycles'.
203 --for-each-cgroup name::
206 effect that repeating -e option and -G option for each event x name. This option
207 cannot be used with -G/--cgroup option.
209 -o file::
210 --output file::
213 --append::
214 Append to the output file designated with the -o option. Ignored if -o is not specified.
216 --log-fd::
218 Log output to fd, instead of stderr. Complementary to --output, and mutually exclusive
219 with it. --append may be used here. Examples:
220 3>results perf stat --log-fd 3 \-- $cmd
221 3>>results perf stat --log-fd 3 --append \-- $cmd
223 --control=fifo:ctl-fifo[,ack-fifo]::
224 --control=fd:ctl-fd[,ack-fd]::
225 ctl-fifo / ack-fifo are opened and used as ctl-fd / ack-fd as follows.
226 Listen on ctl-fd descriptor for command to control measurement ('enable': enable events,
228 --delay=-1 option. Optionally send control command completion ('ack\n') to ack-fd descriptor
237 test -p ${ctl_fifo} && unlink ${ctl_fifo}
242 test -p ${ctl_ack_fifo} && unlink ${ctl_ack_fifo}
246 perf stat -D -1 -e cpu-cycles -a -I 1000 \
247 --control fd:${ctl_fd},${ctl_fd_ack} \
248 \-- sleep 30 &
251 sleep 5 && echo 'enable' >&${ctl_fd} && read -u ${ctl_fd_ack} e1 && echo "enabled(${e1})"
252 sleep 10 && echo 'disable' >&${ctl_fd} && read -u ${ctl_fd_ack} d1 && echo "disabled(${d1})"
254 exec {ctl_fd_ack}>&-
257 exec {ctl_fd}>&-
260 wait -n ${perf_pid}
264 --pre::
265 --post::
268 perf stat --repeat 10 --null --sync --pre 'make -s O=defconfig-build/clean' \-- make -s -j64 O=defc…
270 -I msecs::
271 --interval-print msecs::
274 example: 'perf stat -I 1000 -e cycles -a sleep 5'
278 --interval-count times::
280 This option should be used together with "-I" option.
281 example: 'perf stat -I 1000 --interval-count 2 -e cycles -a'
283 --interval-clear::
286 --timeout msecs::
288 This option is not supported with the "-I" option.
289 example: 'perf stat --timeout 2000 -e cycles -a'
291 --metric-only::
293 Don't show any raw values. Not supported with --per-thread.
295 --per-socket::
296 Aggregate counts per processor socket for system-wide mode measurements. This
298 use --per-socket in addition to -a (system-wide). The output includes the
302 --per-die::
303 Aggregate counts per processor die for system-wide mode measurements. This
305 use --per-die in addition to -a (system-wide). The output includes the
309 --per-core::
310 Aggregate counts per physical processor for system-wide mode measurements. This
312 use --per-core in addition to -a (system-wide). The output includes the
315 --per-thread::
316 Aggregate counts per monitored thread, when monitoring threads (-t option)
317 or processes (-p option).
319 --per-node::
320 Aggregate counts per NUMA node for system-wide mode measurements. This
322 mode, use --per-node in addition to -a (system-wide).
324 -D msecs::
325 --delay msecs::
326 After starting the program, wait msecs before measuring (-1: start with events
330 -T::
331 --transaction::
335 --metric-no-group::
338 --metric-no-group option places events outside of groups and may
339 increase the chance of the event being scheduled - leading to better
340 accuracy. However, as events may not be scheduled together, accuracy
341 for metrics like instructions per cycle can be lower - as both metrics
344 --metric-no-merge::
349 group is that the group may require multiplexing and so accuracy for a
352 may be used to increase accuracy in this case.
354 --quiet::
359 -----------
362 -o file::
363 --output file::
367 -----------
370 -i file::
371 --input file::
374 --per-socket::
375 Aggregate counts per processor socket for system-wide mode measurements.
377 --per-die::
378 Aggregate counts per processor die for system-wide mode measurements.
380 --per-core::
381 Aggregate counts per physical processor for system-wide mode measurements.
383 -M::
384 --metrics::
390 -A::
391 --no-aggr::
394 --topdown::
395 Print complete top-down metrics supported by the CPU. This allows to
408 mode like -I 1000, as the bottleneck of workloads can change often.
410 This enables --metric-only, unless overridden with --no-metric-only.
417 and -a (global monitoring) is needed, requiring root rights or
418 kernel.perf_event_paranoid=-1.
430 --td-level::
431 Print the top-down statistics that are equal to or lower than the input level.
432 It allows users to print only the top-down metrics level of interest instead of
433 the complete top-down metrics.
435 The availability of the top-down metrics level depends on the hardware. For
436 example, Ice Lake only supports L1 top-down metrics. The Sapphire Rapids
437 supports both L1 and L2 top-down metrics.
442 --no-merge::
455 --smi-cost::
461 The cost of SMI can be measured by (aperf - unhalted core cycles).
464 oriented analysis. --metric-only will be applied by default.
465 The output is SMI cycles%, equal to (aperf - unhalted core cycles) / aperf
467 Users who want to get the actual value can apply --no-metric-only.
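The SMI cycles% formula above can be worked through with a short sketch. The counter values below are hypothetical and serve only to show the arithmetic; the function name `smi_cycles_pct` is likewise an assumption of this example.

```shell
#!/bin/bash
# SMI cycles% = (aperf - unhalted core cycles) / aperf, shown as a percentage.
smi_cycles_pct() {
    local aperf=$1 unhalted=$2
    awk -v a="$aperf" -v u="$unhalted" \
        'BEGIN { printf "%.1f%%\n", (a - u) / a * 100 }'
}

# Hypothetical counts: 1,000,000 aperf cycles, 990,000 unhalted core cycles.
smi_cycles_pct 1000000 990000
```

With these made-up inputs, 10,000 of 1,000,000 cycles are attributed to SMI, that is 1.0%.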
469 --all-kernel::
472 --all-user::
475 --percore-show-thread::
484 --summary::
485 Print summary for interval mode (-I).
487 --no-csv-summary::
489 This option must be used with -x and --summary.
492 'stat.no-csv-summary'.
494 $ perf config stat.no-csv-summary=true
497 --------
499 $ perf stat \-- make
503 83723.452481 task-clock:u (msec) # 1.004 CPUs utilized
504 0 context-switches:u # 0.000 K/sec
505 0 cpu-migrations:u # 0.000 K/sec
506 3,228,188 page-faults:u # 0.039 M/sec
510 2,078,861,393 branch-misses:u # 2.98% of all branches
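The rate column in the output above appears consistent with dividing each count by the task-clock time. The sketch below reproduces the page-faults figure from the two values shown; treating the rate as count per task-clock second is an assumption of this example, and the function name `rate_millions_per_sec` is hypothetical.

```shell
#!/bin/bash
# Rate in millions per second: count / (task-clock in seconds) / 1e6.
# Assumption: the rate is computed relative to task-clock time.
rate_millions_per_sec() {
    local count=${1//,/}   # strip thousands separators (see --big-num)
    local msec=$2          # task-clock in milliseconds
    awk -v c="$count" -v ms="$msec" \
        'BEGIN { printf "%.3f M/sec\n", c / (ms / 1000) / 1e6 }'
}

# Values taken from the example output: page-faults and task-clock.
rate_millions_per_sec 3,228,188 83723.452481
```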
518 -------
533 ----------
535 With -x, perf stat is able to produce not-quite-CSV format output
537 it is recommended to use a different character like -x \;
541 - optional usec time stamp in fractions of second (with -I xxx)
542 - optional CPU, core, or socket identifier
543 - optional number of logical CPUs aggregated
544 - counter value
545 - unit of the counter value or empty
546 - event name
547 - run time of counter
548 - percentage of measurement time the counter was running
549 - optional variance if multiple values are collected with -r
550 - optional metric value
551 - optional unit of metric
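The field list above can be turned into a small parser. The sketch below assumes output produced with `-x \;` and without -I, -A, per-socket/per-core aggregation or -r, so that the optional time stamp, identifier, CPU count and variance fields are absent. The sample lines are made up for illustration and are not real perf output; the function name `parse_stat_line` is hypothetical.

```shell
#!/bin/bash
# Split one "-x \;" line into the fields listed above.
# Assumed layout (no -I, no aggregation identifiers, no -r):
#   value;unit;event;run time;percentage;metric value;metric unit
parse_stat_line() {
    local value unit event runtime pct metric metric_unit
    IFS=';' read -r value unit event runtime pct metric metric_unit <<< "$1"
    # Print "event=value", followed by the unit when it is not empty.
    printf '%s=%s%s\n' "$event" "$value" "${unit:+ $unit}"
}

# Hypothetical sample lines.
parse_stat_line '3228188;;page-faults;83723452481;100.00;0.039;M/sec'
parse_stat_line '83723.45;msec;task-clock;83723452481;100.00;1.004;CPUs utilized'
```

Because several leading and trailing fields are optional, a robust consumer would need to know which options were passed to perf stat before assigning column positions.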
555 include::intel-hybrid.txt[]
558 --------
559 linkperf:perf-top[1], linkperf:perf-list[1]