Lines Matching +full:monitor +full:- +full:interval +full:- +full:ms
1 perf-stat(1)
5 ----
6 perf-stat - Run a command and gather performance counter statistics
9 --------
11 'perf stat' [-e <EVENT> | --event=EVENT] [-a] <command>
12 'perf stat' [-e <EVENT> | --event=EVENT] [-a] \-- <command> [<options>]
13 'perf stat' [-e <EVENT> | --event=EVENT] [-a] record [-o file] \-- <command> [<options>]
14 'perf stat' report [-i file]
17 -----------
23 -------
33 -e::
34 --event=::
37 - a symbolic event name (use 'perf list' to list all events)
39 - a raw PMU event in the form of rN where N is a hexadecimal value
44 - a symbolic or raw PMU event followed by an optional colon
45 and a list of event modifiers, e.g., cpu-cycles:p. See the
46 linkperf:perf-list[1] man page for details on event modifiers.
48 - a symbolically formed event like 'pmu/param1=0x3,param2/' where
54 perf stat -A -a -e cpu/event,percore=1/,otherevent ...
56 - a symbolically formed event like 'pmu/config=M,config1=N,config2=K/'
69 -i::
70 --no-inherit::
72 -p::
73 --pid=<pid>::
76 -t::
77 --tid=<tid>::
80 -b::
81 --bpf-prog::
83 requiring root rights. bpftool-prog could be used to find program
86 # bpftool prog | head -n 1
89 # perf stat -e cycles,instructions --bpf-prog 17247 --timeout 1000
98 --bpf-counters::
100 allows multiple perf-stat sessions that are counting the same metric (cycles,
103 "perf config stat.bpf-counter-events=<list_of_events>".
105 --bpf-attr-map::
106 With option "--bpf-counters", different perf-stat sessions share
108 Use "--bpf-attr-map" to specify the path of this pinned hashmap.
112 --pfm-events events::
114 including support for event filters. For example '--pfm-events
117 events cannot be mixed together. The latter must be used with the -e
118 option. The -e option and this one can be mixed and matched. Events
122 -a::
123 --all-cpus::
124 system-wide collection from all CPUs (default if no target is specified)
126 --no-scale::
129 -d::
130 --detailed::
133 -d: detailed events, L1 and LLC data cache
134 -d -d: more detailed events, dTLB and iTLB events
135 -d -d -d: very detailed events, adding prefetch events
137 -r::
138 --repeat=<n>::
141 -B::
142 --big-num::
144 Enabled by default. Use "--no-big-num" to disable.
145 Default setting can be changed with "perf config stat.big-num=false".
147 -C::
148 --cpu=::
150 comma-separated list with no space: 0,1. Ranges of CPUs are specified with -: 0-2.
151 In per-thread mode, this option is ignored. The -a option is still necessary
152 to activate system-wide monitoring. Default is to count on all CPUs.
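The CPU list syntax described above (comma-separated entries, ranges written with a dash) can be illustrated with a small sketch. The helper `expand_cpu_list` is hypothetical and not part of perf; it only shows how a list such as `0,1` or `0-2` maps to individual CPU numbers, and it assumes `seq` is available.

```shell
# expand_cpu_list: hypothetical helper, not part of perf.
# Expands a CPU list in the -C/--cpu syntax into one CPU number per line.
# The body runs in a subshell so the IFS change does not leak to the caller.
expand_cpu_list() (
    IFS=,
    for part in $1; do
        case $part in
            # A range such as 0-2: print every CPU from the low to the high end.
            *-*) seq "${part%-*}" "${part#*-}" ;;
            # A single CPU number.
            *)   echo "$part" ;;
        esac
    done
)

expand_cpu_list "0,1"   # the comma-separated form from the text
expand_cpu_list "0-2"   # the range form from the text
```

A script could use such a helper to check a CPU list before passing it to `perf stat -a -C <list>`.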
154 -A::
155 --no-aggr::
158 -n::
159 --null::
160 null run - Don't start any counters.
162 This can be useful to measure just elapsed wall-clock time - or to assess the
165 -v::
166 --verbose::
169 -x SEP::
170 --field-separator SEP::
171 print counts using a CSV-style output to make it easy to import directly into
174 --table:: Display time for each run (-r option), in a table format, e.g.:
176 $ perf stat --null -r 5 --table perf bench sched pipe
181 5.189 (-0.293) #
182 5.189 (-0.294) #
183 5.186 (-0.296) #
188 5.483 +- 0.198 seconds time elapsed ( +- 3.62% )
190 -G name::
191 --cgroup name::
192 monitor only in the container (cgroup) called "name". This option is available only
193 in per-cpu mode. The cgroup filesystem must be mounted. All threads belonging to
197 an empty cgroup (monitor all the time) using, e.g., -G foo,,bar. Cgroups must have
200 use '-e e1 -e e2 -G foo,foo' or just use '-e e1 -e e2 -G foo'.
202 To monitor, say, 'cycles' for a cgroup and also system-wide, this
203 command line can be used: 'perf stat -e cycles -G cgroup_name -a -e cycles'.
205 --for-each-cgroup name::
208 effect that repeating -e option and -G option for each event x name. This option
209 cannot be used with -G/--cgroup option.
211 -o file::
212 --output file::
215 --append::
216 Append to the output file designated with the -o option. Ignored if -o is not specified.
218 --log-fd::
220 Log output to fd, instead of stderr. Complementary to --output, and mutually exclusive
221 with it. --append may be used here. Examples:
222 3>results perf stat --log-fd 3 \-- $cmd
223 3>>results perf stat --log-fd 3 --append \-- $cmd
225 --control=fifo:ctl-fifo[,ack-fifo]::
226 --control=fd:ctl-fd[,ack-fd]::
227 ctl-fifo / ack-fifo are opened and used as ctl-fd / ack-fd as follows.
228 Listen on ctl-fd descriptor for command to control measurement ('enable': enable events,
230 --delay=-1 option. Optionally send control command completion ('ack\n') to ack-fd descriptor
239 test -p ${ctl_fifo} && unlink ${ctl_fifo}
244 test -p ${ctl_ack_fifo} && unlink ${ctl_ack_fifo}
248 perf stat -D -1 -e cpu-cycles -a -I 1000 \
249 --control fd:${ctl_fd},${ctl_fd_ack} \
250 \-- sleep 30 &
253 sleep 5 && echo 'enable' >&${ctl_fd} && read -u ${ctl_fd_ack} e1 && echo "enabled(${e1})"
254 sleep 10 && echo 'disable' >&${ctl_fd} && read -u ${ctl_fd_ack} d1 && echo "disabled(${d1})"
256 exec {ctl_fd_ack}>&-
259 exec {ctl_fd}>&-
262 wait -n ${perf_pid}
266 --pre::
267 --post::
270 perf stat --repeat 10 --null --sync --pre 'make -s O=defconfig-build/clean' \-- make -s -j64 O=defc…
272 -I msecs::
273 --interval-print msecs::
274 Print count deltas every N milliseconds (minimum: 1 ms).
275 The overhead percentage could be high in some cases, for instance with small, sub 100ms intervals. …
276 example: 'perf stat -I 1000 -e cycles -a sleep 5'
278 If the metric exists, it is calculated by the counts generated in this interval and the metric is p…
280 --interval-count times::
282 This option should be used together with "-I" option.
283 example: 'perf stat -I 1000 --interval-count 2 -e cycles -a'
285 --interval-clear::
286 Clear the screen before next interval.
288 --timeout msecs::
289 Stop the 'perf stat' session and print count deltas after N milliseconds (minimum: 10 ms).
290 This option is not supported with the "-I" option.
291 example: 'perf stat --timeout 2000 -e cycles -a'
293 --metric-only::
295 Don't show any raw values. Not supported with --per-thread.
297 --per-socket::
298 Aggregate counts per processor socket for system-wide mode measurements. This
300 use --per-socket in addition to -a (system-wide). The output includes the
304 --per-die::
305 Aggregate counts per processor die for system-wide mode measurements. This
307 use --per-die in addition to -a (system-wide). The output includes the
311 --per-cache::
312 Aggregate counts per cache instance for system-wide mode measurements. By
315 alongside the option in the format [Ll][1-9][0-9]*. For example:
316 Using option "--per-cache=l3" or "--per-cache=L3" will aggregate the
319 --per-core::
320 Aggregate counts per physical processor for system-wide mode measurements. This
322 use --per-core in addition to -a (system-wide). The output includes the
325 --per-thread::
326 Aggregate counts per monitored thread, when monitoring threads (-t option)
327 or processes (-p option).
329 --per-node::
330 Aggregate counts per NUMA node for system-wide mode measurements. This
332 mode, use --per-node in addition to -a (system-wide).
334 -D msecs::
335 --delay msecs::
336 After starting the program, wait msecs before measuring (-1: start with events
340 -T::
341 --transaction::
345 --metric-no-group::
348 --metric-no-group option places events outside of groups and may
349 increase the chance of the event being scheduled - leading to more
351 for metrics like instructions per cycle can be lower - as both metrics
354 --metric-no-merge::
364 --metric-no-threshold::
373 --quiet::
378 -----------
381 -o file::
382 --output file::
386 -----------
389 -i file::
390 --input file::
393 --per-socket::
394 Aggregate counts per processor socket for system-wide mode measurements.
396 --per-die::
397 Aggregate counts per processor die for system-wide mode measurements.
399 --per-cache::
400 Aggregate counts per cache instance for system-wide mode measurements. By
403 alongside the option in the format [Ll][1-9][0-9]*. For example: Using
404 option "--per-cache=l3" or "--per-cache=L3" will aggregate the
407 --per-core::
408 Aggregate counts per physical processor for system-wide mode measurements.
410 -M::
411 --metrics::
423 -A::
424 --no-aggr::
427 --topdown::
428 Print top-down metrics supported by the CPU. This allows determining
440 For best results it is usually a good idea to use it with interval
441 mode like -I 1000, as the bottleneck of workloads can change often.
443 This enables --metric-only, unless overridden with --no-metric-only.
450 and -a (global monitoring) is needed, requiring root rights or
451 kernel.perf_event_paranoid=-1.
463 --td-level::
464 Print the top-down statistics for the given level. It allows
465 users to print the top-down metrics level of interest instead of the
466 level 1 top-down metrics.
474 'perf stat -M tma_frontend_bound_group...'.
478 --no-merge::
491 --hybrid-merge::
499 For non-hybrid events, it should have no effect.
501 --smi-cost::
507 The cost of SMI can be measured by (aperf - unhalted core cycles).
510 oriented analysis. --metric-only will be applied by default.
511 The output is SMI cycles%, equal to (aperf - unhalted core cycles) / aperf
513 Users who want to get the actual value can apply --no-metric-only.
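As a worked illustration of the SMI cycles% formula above, the following sketch computes (aperf - unhalted core cycles) / aperf as a percentage. The function name `smi_cycles_pct` and the input numbers are made up for illustration; real values would come from the counters perf reads with --smi-cost.

```shell
# smi_cycles_pct: hypothetical helper, not part of perf.
#   $1 = aperf count, $2 = unhalted core cycles count
# Prints (aperf - unhalted core cycles) / aperf as a percentage.
smi_cycles_pct() {
    awk -v aperf="$1" -v unhalted="$2" 'BEGIN {
        # Avoid dividing by zero when no aperf cycles were counted.
        if (aperf <= 0) exit 1
        printf "%.1f%%\n", (aperf - unhalted) / aperf * 100
    }'
}

# Made-up counts: 10,000 of 1,000,000 aperf cycles are unaccounted for.
smi_cycles_pct 1000000 990000
```

With these made-up inputs the difference is 10,000 cycles out of 1,000,000, so the function prints 1.0%.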
515 --all-kernel::
518 --all-user::
521 --percore-show-thread::
530 --summary::
531 Print summary for interval mode (-I).
533 --no-csv-summary::
535 This option must be used with -x and --summary.
538 'stat.no-csv-summary'.
540 $ perf config stat.no-csv-summary=true
542 --cputype::
547 --------
549 $ perf stat \-- make
553 83723.452481 task-clock:u (msec) # 1.004 CPUs utilized
554 0 context-switches:u # 0.000 K/sec
555 0 cpu-migrations:u # 0.000 K/sec
556 3,228,188 page-faults:u # 0.039 M/sec
560 2,078,861,393 branch-misses:u # 2.98% of all branches
568 -------
583 ----------
585 With -x, perf stat is able to produce not-quite-CSV format output
587 it is recommended to use a different character like -x \;
591 - optional usec time stamp in fractions of a second (with -I xxx)
592 - optional CPU, core, or socket identifier
593 - optional number of logical CPUs aggregated
594 - counter value
595 - unit of the counter value or empty
596 - event name
597 - run time of counter
598 - percentage of measurement time the counter was running
599 - optional variance if multiple values are collected with -r
600 - optional metric value
601 - optional unit of metric
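The field order above can be consumed with a short awk filter. This is a minimal sketch under stated assumptions: the output was produced with `-x \;`, the optional leading fields (time stamp, CPU identifier, aggregated CPU count) are absent, and the sample lines and the function name `parse_perf_csv` are made up for illustration rather than captured from a real run.

```shell
# Hypothetical sample of 'perf stat -x \;' output; the numbers are made up.
# Field order assumed here: value;unit;event;run time;percent running;metric;metric unit
sample='1234567;;cycles;1000200;100.00;;
2345678;;instructions;1000200;100.00;1.90;insn per cycle'

# parse_perf_csv: hypothetical helper, not part of perf.
# Reads ';'-separated lines on stdin and prints "event=value" for each one,
# skipping comment lines and lines with fewer than three fields.
parse_perf_csv() {
    awk -F';' 'NF >= 3 && $1 !~ /^#/ { printf "%s=%s\n", $3, $1 }'
}

printf '%s\n' "$sample" | parse_perf_csv
```

When -I or an aggregation mode is in use, the extra leading fields shift the positions, so the field numbers in the awk program would need adjusting.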
605 include::intel-hybrid.txt[]
608 -----------
610 With -j, perf stat is able to print out a JSON format output
613 - timestamp : optional usec time stamp in fractions of a second (with -I)
614 - optional aggregate options:
615 - core : core identifier (with --per-core)
616 - die : die identifier (with --per-die)
617 - socket : socket identifier (with --per-socket)
618 - node : node identifier (with --per-node)
619 - thread : thread identifier (with --per-thread)
620 - counter-value : counter value
621 - unit : unit of the counter value or empty
622 - event : event name
623 - variance : optional variance if multiple values are collected (with -r)
624 - runtime : run time of counter
625 - metric-value : optional metric value
626 - metric-unit : optional unit of metric
629 --------
630 linkperf:perf-top[1], linkperf:perf-list[1]