Lines Matching +full:sub +full:- +full:sampled
1 perf-intel-pt(1)
5 ----
6 perf-intel-pt - Support for Intel Processor Trace within perf tools
9 --------
11 'perf record' -e intel_pt//
14 -----------
19 Technical details are documented in the Intel 64 and IA-32 Architectures
23 processors that are based on the Intel micro-architecture code name Broadwell.
33 Decoding is done on-the-fly. The decoder outputs samples in the same format as
43 builds, however the executed images are needed - which makes use in JIT-compiled
44 environments, or with self-modified code, a challenge. Also symbols need to be
51 vary depending on the use-case and architecture.
55 ----------
61 Data is captured with 'perf record' e.g. to trace 'ls' userspace-only:
63 perf record -e intel_pt//u ls
69 Tracing kernel space as well presents a problem, namely kernel self-modifying
73 --kcore is used, but access to /proc/kcore is restricted e.g.
75 sudo perf record -o pt_ls --kcore -e intel_pt// -- ls
82 sudo perf report -i pt_ls
84 Because samples are synthesized after-the-fact, the sampling period can be
87 sudo perf report -i pt_ls --itrace=i1usge
89 See the sections below for more information about the --itrace option.
103 perf record -e intel_pt//u ls
104 perf script --itrace=ibxwpe
109 perf script --itrace=ibxwpe -F+flags
113 in transaction, VM-entry, VM-exit, interrupt disabled, and interrupt disable
118 perf script --insn-trace --xed
124 perf script --call-trace
128 perf script --call-ret-trace
133 perf script --time starttime,stoptime --insn-trace --xed
136 the -C option
138 perf script --time starttime,stoptime --insn-trace --xed -C 1
145 perf script --itrace=be -F+ipc
147 There are two ways that instructions-per-cycle (IPC) can be calculated depending
152 used - refer to the 'mtc' config term. When MTC is used, however, the values
169 useful to use the 'A' option in conjunction with dlfilter-show-cycles.so to
178 Note also that for "branches" events, non-taken branches are not
179 presently sampled, so IPC values for them do not appear, e.g. a CYC packet with a
180 TNT packet that starts with a non-taken branch. To see every possible IPC
181 value, "instructions" events can be used, e.g. --itrace=i0ns
185 Refer to script export-to-sqlite.py or export-to-postgresql.py for more details,
186 and to script exported-sql-viewer.py for an example of using the database.
188 There is also script intel-pt-events.py which provides an example of how to
192 --insn-trace - instruction trace
193 --src-trace - source trace
200 by inability to access the executed image, self-modified or JIT-ed code, or the
201 inability to match side-band information (such as context switches and mmaps)
210 -----------
219 -e intel_pt//
223 -e intel_pt/tsc,noretcomp=0/
227 -e intel_pt/tsc=1,noretcomp=0/
229 Note there are now new config terms - see section 'config terms' further below.
236 $ grep -H . /sys/bus/event_source/devices/intel_pt/format/*
238 /sys/bus/event_source/devices/intel_pt/format/cyc_thresh:config:19-22
240 /sys/bus/event_source/devices/intel_pt/format/mtc_period:config:14-17
242 /sys/bus/event_source/devices/intel_pt/format/psb_period:config:24-27
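The format entries above give the bit range that each config term occupies in the event's config value. As a minimal sketch (parse_format and pack_term are hypothetical helper names, not part of perf), the following Python places a term value into such a range:

```python
def parse_format(spec):
    """Parse a sysfs format string such as 'config:19-22' or 'config:10'.

    Returns (field, low_bit, high_bit).
    """
    field, _, bits = spec.partition(":")
    low, _, high = bits.partition("-")
    low = int(low)
    high = int(high) if high else low
    return field, low, high


def pack_term(spec, value):
    """Shift value into the bit range described by spec, checking that it fits."""
    _, low, high = parse_format(spec)
    width = high - low + 1
    if not 0 <= value < (1 << width):
        raise ValueError(f"value {value} does not fit in {width} bits")
    return value << low


# Bit ranges taken from the grep output above; the term values are illustrative.
print(hex(pack_term("config:14-17", 3)))  # mtc_period=3 -> 0xc000
print(hex(pack_term("config:24-27", 2)))  # psb_period=2 -> 0x2000000
```

Values from several terms can be combined with a bitwise OR to form a single config value.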
247 -e intel_pt/noretcomp=0/
251 -e intel_pt/tsc=1,noretcomp=0/
255 -e intel_pt/tsc=0/
259 -e intel_pt/config=0x400/
274 perf_event_attr is displayed if the -vv option is used e.g.
276 ------------------------------------------------------------
290 ------------------------------------------------------------
291 sys_perf_event_open: pid 31104 cpu 0 group_fd -1 flags 0x8
292 sys_perf_event_open: pid 31104 cpu 1 group_fd -1 flags 0x8
293 sys_perf_event_open: pid 31104 cpu 2 group_fd -1 flags 0x8
294 sys_perf_event_open: pid 31104 cpu 3 group_fd -1 flags 0x8
295 ------------------------------------------------------------
301 The June 2015 version of Intel 64 and IA-32 Architectures Software Developer
308 without timing information, for example a per-thread context
348 $ perf record -e intel_pt/psb_period=15/u uname
349 Invalid psb_period for intel_pt. Valid values are: 0-5
375 The frequency of MTC packets can also be specified - see
378 mtc_period Specifies how frequently MTC packets are produced - see mtc
390 CTC-frequency / (2 ^ value)
392 e.g. value 3 means one eighth of CTC-frequency
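The formula above can be checked with a short Python sketch. The function name mtc_frequency and the 24 MHz CTC frequency are assumptions chosen only for illustration:

```python
def mtc_frequency(ctc_frequency_hz, mtc_period):
    """Return the MTC packet frequency: CTC-frequency / (2 ^ value)."""
    if mtc_period < 0:
        raise ValueError("mtc_period must be non-negative")
    return ctc_frequency_hz / (2 ** mtc_period)


# Hypothetical CTC frequency of 24 MHz; value 3 gives one eighth of it.
print(mtc_frequency(24_000_000, 3))  # 3000000.0
```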
400 $ perf record -e intel_pt/mtc_period=15/u uname
421 a threshold - see cyc_thresh below.
423 cyc_thresh Specifies how frequently CYC packets are produced - see cyc
437 2 ^ (value - 1)
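As a minimal sketch of the formula above, the following Python converts a cyc_thresh value into the corresponding number of CPU cycles. The function name is hypothetical, and the sketch applies the formula only to values of 1 or more:

```python
def cyc_thresh_cycles(value):
    """Return 2 ^ (value - 1), the cycle threshold for a cyc_thresh value >= 1."""
    if value < 1:
        raise ValueError("this sketch applies the formula only for value >= 1")
    return 2 ** (value - 1)


for value in (1, 4, 12):
    print(value, cyc_thresh_cycles(value))
```

For example, value 4 corresponds to 8 CPU cycles.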
446 $ perf record -e intel_pt/cyc,cyc_thresh=15/u uname
447 Invalid cyc_thresh for intel_pt. Valid values are: 0-12
451 pt Specifies pass-through which enables the 'branch' config term.
480 changes to the CPU C-state.
502 return compression is disabled - see noretcomp) return statements.
519 --aux-sample
523 --aux-sample=8192
527 -e intel_pt//u
530 following will create Intel PT samples on the branch-misses event, note the
533 perf record --aux-sample -e '{intel_pt//u,branch-misses:u}'
535 An alternative to '--aux-sample' is to add the config term 'aux-sample-size' to
538 perf record -e intel_pt//u -e branch-misses/aux-sample-size=8192/u
542 perf record -e '{intel_pt//u,branch-misses/aux-sample-size=8192/u}'
546 …perf record -e intel_pt//u --filter 'filter * @/bin/ls' -e branch-misses/aux-sample-size=8192/u --…
576 -S
580 -S0x100000
588 The snapshot size is displayed if the option -vv is used e.g.
596 Intel PT buffer size is specified by an addition to the -m option e.g.
598 -m,16
602 Note that the existing functionality of -m is unchanged. The auxtrace mmap size
616 In full-trace mode, powers of two are allowed for buffer size, with a minimum
620 The mmap size and auxtrace mmap size are displayed if the -vv option is used e.g.
630 full-trace mode
634 Full-trace mode traces continuously e.g.
636 perf record -e intel_pt//u uname
640 perf record --aux-sample -e intel_pt//u -e branch-misses:u
645 perf record -v -e intel_pt//u -S ./loopy 1000000000 &
647 kill -USR2 11435
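The same signal can be sent from a script. This is a minimal Python sketch, assuming the pid of the running 'perf record -S' process is already known; request_snapshot is a hypothetical helper name:

```python
import os
import signal


def request_snapshot(perf_pid):
    """Send SIGUSR2 to a running 'perf record -S' process.

    Returns True if the signal was sent, False if the caller lacks
    permission to signal that process (for example, perf run via sudo).
    """
    try:
        os.kill(perf_pid, signal.SIGUSR2)
    except PermissionError:
        return False
    return True
```

A False return corresponds to the "Operation not permitted" case that the shell reports when kill is used on a perf process owned by another user.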
651 Note that "Recording AUX area tracing snapshot" is displayed because the -v
661 $ sudo ~/bin/perf record --control fifo:perf.control,perf.ack -S -e intel_pt//u -- sleep 60 &
663 $ ps -e | grep perf
665 $ kill -USR2 15244
666 bash: kill: (15244) - Operation not permitted
689 In full-trace mode, the driver waits for data to be copied out before allowing
690 the (logical) buffer to wrap-around. If data is not copied out quickly enough,
693 that happens, perf tools always re-enable the intel_pt event after copying out
700 By default "perf record" post-processes the event stream to find all build ids
701 for executables for all addresses sampled. Deliberately, Intel PT is not
708 perf buildid-list
712 perf buildid-list --with-hits
720 collection of side-band information. In order to prevent that, a dummy
723 there is complete side-band information to allow the decoding of subsequent
746 "per thread" mode is selected by -t or by --per-thread (with -p or -u or just a
748 "per cpu" is selected by -C or -a.
752 In per-thread mode an exact list of threads is traced. There is no inheritance.
755 In per-cpu mode all processes (or processes from the selected cgroup i.e. -G
756 option, or processes selected with -p or -u) are traced. Each cpu has its own
759 In workload-only mode, the workload is traced but with per-cpu buffers.
760 Inheritance is allowed. Note that you can now trace a workload in per-thread
761 mode by using the --per-thread option.
764 Privileged vs non-privileged users
767 Unless /proc/sys/kernel/perf_event_paranoid is set to -1, unprivileged users
783 Unless /proc/sys/kernel/perf_event_paranoid is set to -1, unprivileged users are
784 not permitted to use tracepoints which means there is insufficient side-band
785 information to decode Intel PT in per-cpu mode, and potentially workload-only
788 Note also that, to use tracepoints, read-access to debugfs is required. So if
789 debugfs is not mounted or the user does not have read-access, it will again not
790 be possible to decode Intel PT in per-cpu mode.
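The perf_event_paranoid condition described above can be checked before recording. The following Python is a simplified sketch: the function names are hypothetical, and it only tests for the value -1, ignoring any other way a user might gain the needed privileges:

```python
from pathlib import Path

PARANOID_PATH = "/proc/sys/kernel/perf_event_paranoid"


def read_paranoid(path=PARANOID_PATH):
    """Return the integer stored in the perf_event_paranoid file."""
    return int(Path(path).read_text().strip())


def tracepoints_open_to_unprivileged(path=PARANOID_PATH):
    """True only when the setting is -1, the case the text above singles out."""
    return read_paranoid(path) == -1
```

Calling tracepoints_open_to_unprivileged() with no argument reads the live setting; the path parameter exists so that the check can be exercised against an ordinary file.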
796 The sched_switch tracepoint is used to provide side-band data for Intel PT
803 $ perf record -vv -e intel_pt//u uname
804 ------------------------------------------------------------
818 ------------------------------------------------------------
819 sys_perf_event_open: pid 31104 cpu 0 group_fd -1 flags 0x8
820 sys_perf_event_open: pid 31104 cpu 1 group_fd -1 flags 0x8
821 sys_perf_event_open: pid 31104 cpu 2 group_fd -1 flags 0x8
822 sys_perf_event_open: pid 31104 cpu 3 group_fd -1 flags 0x8
823 ------------------------------------------------------------
834 ------------------------------------------------------------
835 sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8
836 sys_perf_event_open: pid -1 cpu 1 group_fd -1 flags 0x8
837 sys_perf_event_open: pid -1 cpu 2 group_fd -1 flags 0x8
838 sys_perf_event_open: pid -1 cpu 3 group_fd -1 flags 0x8
839 ------------------------------------------------------------
858 ------------------------------------------------------------
859 sys_perf_event_open: pid 31104 cpu 0 group_fd -1 flags 0x8
860 sys_perf_event_open: pid 31104 cpu 1 group_fd -1 flags 0x8
861 sys_perf_event_open: pid 31104 cpu 2 group_fd -1 flags 0x8
862 sys_perf_event_open: pid 31104 cpu 3 group_fd -1 flags 0x8
872 and only in per-cpu mode.
880 -----------
883 This can be further controlled by the new option --itrace.
886 New --itrace option
891 --itrace
895 --itrace=cepwx
906 o synthesize PEBS-via-PT events
917 Z prefer to ignore timestamps (so-called "timeless" decoding)
919 "Instructions" events look like they were recorded by "perf record -e
922 "Branches" events look like they were recorded by "perf record -e branches". "c"
940 "Power" events correspond to power event packets and CBR (core-to-bus ratio)
944 C-state changes, whereas CBR is indicative of CPU frequency. perf script
949 pwre: hw: 0 cstate: 2 sub-cstate: 0
955 "cbr" includes the frequency and the percentage of maximum non-turbo
957 "pwre" shows C-state transitions (to a C-state deeper than C0) and
963 For more details refer to the Intel 64 and IA-32 Architectures Software
966 PSB events show when a PSB+ occurred and also the byte-offset in the trace.
974 will or will not be reported. Each flag must be preceded by either '+' or '-'.
977 -o Suppress overflow errors
978 -l Suppress trace data lost errors
982 --itrace=e-o-l
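An option string like the one above can be assembled programmatically. This Python sketch covers only the two error flags listed here; itrace_error_option is a hypothetical helper name:

```python
# Error flags taken from the list above.
ERROR_FLAGS = {
    "o": "overflow errors",
    "l": "trace data lost errors",
}


def itrace_error_option(suppress=""):
    """Build '--itrace=e' followed by '-<flag>' for each error flag to suppress."""
    for flag in suppress:
        if flag not in ERROR_FLAGS:
            raise ValueError(f"unknown error flag: {flag!r}")
    return "--itrace=e" + "".join("-" + flag for flag in suppress)


print(itrace_error_option("ol"))  # --itrace=e-o-l
```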
988 must be preceded by either '+' or '-'. The flags supported by Intel PT are:
990 -a Suppress logging of perf events
998 linkperf:perf-config[1] e.g. perf config itrace.debug-log-buffer-size=30000
1002 --itrace=i10us
1020 'instructions' (i.e. --itrace=i1i).
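The period examples in this document (i10us, i1i, i0ns) share one shape: 'i', a number, then a unit. The Python sketch below pulls that prefix apart. The unit list (i, t, ms, us, ns) and the nanosecond default are assumptions not shown in the lines above, and parse_instructions_period is a hypothetical helper, not perf's own parser:

```python
import re

# 'i', a decimal count, then a unit or the end of the string.
# The unit list is an assumption: i, t, ms, us, ns.
_PERIOD_RE = re.compile(r"i(\d+)(ms|us|ns|i|t|$)")


def parse_instructions_period(itrace):
    """Return (count, unit) for an itrace string that starts with an
    'instructions' period, or None if no period is given.

    A missing unit is reported as 'ns' (assumed default).
    """
    match = _PERIOD_RE.match(itrace)
    if not match:
        return None
    return int(match.group(1)), match.group(2) or "ns"


for opt in ("i10us", "i1i", "i0ns", "i100usle", "ig32"):
    print(opt, parse_instructions_period(opt))
```

Options that follow the period, such as the "le" in i100usle, are left for the caller to handle.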
1025 --itrace=ig32
1026 --itrace=xg32
1031 --itrace=il10
1032 --itrace=xl10
1039 instead of synthesized events. For example, to record branch-misses events for
1042 perf record --aux-sample -e '{intel_pt//u,branch-misses:u}' -- ls
1043 perf report --itrace=Ge
1055 - hardware supports it
1056 - PEBS is used
1057 - event period is specified, instead of frequency
1058 - the sample type is limited to the following flags:
1067 cases, avoid specifying the event period i.e. avoid the 'perf record' -c option,
1068 --count option, or 'period' config term.
1070 To disable trace decoding entirely, use the option --no-itrace.
1075 --itrace=i0nss1000000
1086 ranges that could then be decoded fully using the --time option.
1090 - direct calls and jmps
1091 - conditional branches
1092 - non-branch instructions
1096 - asynchronous branches such as interrupts
1097 - indirect branches
1098 - function return target address *if* the noretcomp config term (refer
1100 - start of (control-flow) tracing
1101 - end of (control-flow) tracing, if it is not out of context
1102 - power events, ptwrite, transaction start and abort
1103 - instruction pointer associated with PSB packets
1108 Repeating the q option (double-q i.e. qq) results in even faster decoding and even
1117 - everything except instruction pointer associated with PSB packets
1121 - instruction pointer associated with PSB packets
1128 dlfilter-show-cycles.so
1131 Cycles can be displayed using dlfilter-show-cycles.so in which case the itrace A
1134 perf script --itrace=A --call-trace --dlfilter dlfilter-show-cycles.so
1138 perf script -v --list-dlfilters
1140 See also linkperf:perf-dlfilters[1]
1146 perf script has an option (-D) to "dump" the events i.e. display the binary
1149 When -D is used, Intel PT packets are displayed. The packet decoder does not
1150 pay attention to PSB packets, but just decodes the bytes - so the packets seen
1152 One example of that would be when the buffer-switching interrupt has been too
1157 To disable the display of Intel PT packets, combine the -D option with
1158 --no-itrace.
1162 -----------
1165 This can be further controlled by the new option --itrace, exactly the same as
1166 perf script, with the exception that the default is --itrace=igxe.
1170 -----------
1172 perf inject also accepts the --itrace option in which case tracing data is
1175 perf inject --itrace -i perf.data -o perf.data.new
1182 $ gcc-5 -O3 sort.c -o sort_optimized
1188 [intel-pt]
1189 mispred-all = on
1191 $ perf record -e intel_pt//u ./sort 3000
1196 $ perf inject -i perf.data -o inj --itrace=i100usle --strip
1197 $ ./create_gcov --binary=./sort --profile=inj --gcov=sort.gcov -gcov_version=1
1198 $ gcc-5 -O3 -fauto-profile=sort.gcov sort.c -o sort_autofdo
1208 -----------------
1211 Recording is selected by using the aux-output config term e.g.
1213 perf record -c 10000 -e '{intel_pt/branch=0/,cycles/aux-output/ppp}' uname
1217 kernels and perf tools add support for the PERF_RECORD_AUX_OUTPUT_HW_ID side-band event.
1218 To check for the presence of that event in a PEBS-via-PT trace:
1220 perf script -D --no-itrace | grep PERF_RECORD_AUX_OUTPUT_HW_ID
1224 perf script --itrace=oe
1227 ---
1229 include::build-xed.txt[]
1233 --------------------------------------
1243 …Guest kernel self-modifying code (e.g. jump labels or JIT-compiled eBPF) will result in decoding e…
1255 Mount the guest file system. Note sshfs needs -o direct_io to enable reading of proc files. root …
1258 $ sshfs -o direct_io root@vm0:/ vm0
1262 $ perf buildid-cache -v --kcore vm0/proc/kcore
1263 …kcore added to build-id cache directory /home/user/.debug/[kernel.kcore]/9600f316a53a0f54278885e8d…
1268 $ ps -eLl | grep 'KVM\|PID'
1270 3 S 64055 1430 1 1440 1 80 0 - 1921718 - ? 00:02:47 CPU 0/KVM
1271 3 S 64055 1430 1 1441 1 80 0 - 1921718 - ? 00:02:41 CPU 1/KVM
1272 3 S 64055 1430 1 1442 1 80 0 - 1921718 - ? 00:02:38 CPU 2/KVM
1273 3 S 64055 1430 1 1443 2 80 0 - 1921718 - ? 00:03:18 CPU 3/KVM
1275 Start an open-ended perf record, tracing the VM process, do something on the VM, and then ctrl-C to…
1279 Intel PT traces both the host and the guest so --guest and --host need to be specified.
1280 Without timestamps, --per-thread must be specified to distinguish threads.
1282 …$ sudo perf kvm --guest --host --guestkallsyms $KALLSYMS record --kcore -e intel_pt/tsc=0,mtc=0,cy…
1289 $ perf script --guestkallsyms $KALLSYMS --insn-trace --xed -F+ipc | grep -C10 vmresume | head -21
1319 Mount the guest file system. Note sshfs needs -o direct_io to enable reading of proc files. root …
1321 $ mkdir -p vm0
1322 $ sshfs -o direct_io root@vm0:/ vm0
1326 $ perf buildid-cache -v --kcore vm0/proc/kcore
1332 $ ps -eLl | grep 'KVM\|PID'
1334 3 S 64055 16998 1 17005 13 80 0 - 1818189 - ? 00:00:16 CPU 0/KVM
1335 3 S 64055 16998 1 17006 4 80 0 - 1818189 - ? 00:00:05 CPU 1/KVM
1336 3 S 64055 16998 1 17007 3 80 0 - 1818189 - ? 00:00:04 CPU 2/KVM
1337 3 S 64055 16998 1 17008 4 80 0 - 1818189 - ? 00:00:05 CPU 3/KVM
1339 Start an open-ended perf record, tracing the VM process, do something on the VM, and then ctrl-C to…
1342 Intel PT traces both the host and the guest so --guest and --host need to be specified.
1344 …$ sudo perf kvm --guest --host --guestkallsyms $KALLSYMS record --kcore -e intel_pt/cyc=1/k -p 169…
1349 only 7 bytes, so the TSC Offset might differ from the actual value in the 8th byte. That will
1352 $ perf inject -i perf.data.kvm --vm-time-correlation=dry-run
1368 $ perf inject -i perf.data.kvm --vm-time-correlation="dry-run 0xffffe42722c64c41"
1370 Note the options for 'perf inject' --vm-time-correlation are:
1372 [ dry-run ] [ <TSC Offset> [ : <VMCS> [ , <VMCS> ]... ] ]...
1375 The option "dry-run" will cause the file to be processed but without updating it.
1376 Note it is also possible to get an intel_pt.log file by adding option --itrace=d
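The option syntax above can be illustrated by a small builder. This Python sketch is not part of perf: vm_time_correlation_arg is a hypothetical helper, and it assumes that ':' and ',' are written without surrounding spaces and that entries are separated by a single space:

```python
def vm_time_correlation_arg(offsets=(), dry_run=False):
    """Build a value for 'perf inject --vm-time-correlation'.

    offsets is a sequence of (tsc_offset, vmcs_list) pairs; vmcs_list may
    be empty when the TSC Offset is not tied to particular VMCS addresses.
    """
    parts = ["dry-run"] if dry_run else []
    for tsc_offset, vmcs_list in offsets:
        item = hex(tsc_offset)
        if vmcs_list:
            item += ":" + ",".join(hex(vmcs) for vmcs in vmcs_list)
        parts.append(item)
    return " ".join(parts)


# Reproduces the dry-run example shown earlier in this section.
print(vm_time_correlation_arg([(0xffffe42722c64c41, [])], dry_run=True))
```

The result contains a space when dry-run is combined with an offset, so it needs quoting on the shell command line, as in the example above.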
1380 $ perf inject -i perf.data.kvm --vm-time-correlation=0xffffe42722c64c41 --force
1384 $ perf script -i perf.data.kvm --guestkallsyms $KALLSYMS --itrace=e-o
1390 …$ perf script -i perf.data.kvm --guestkallsyms $KALLSYMS --insn-trace --xed -F+ipc | grep -C10 vmr…
1415 -----------------------------------------------
1424 Check that no-kvmclock kernel command line option was used to boot:
1429 …BOOT_IMAGE=/boot/vmlinuz-5.10.0-16-amd64 root=UUID=cb49c910-e573-47e0-bce7-79e293df8e1d ro no-kvmc…
1438 …$ sudo perf record -o guest-sideband-testing-guest-perf.data --sample-identifier --buildid-all --s…
1446 $ sudo perf record -o guest-sideband-testing-host-perf.data -m,64M --kcore -a -e intel_pt/cyc/
1461 [ perf record: Captured and wrote 76.122 MB guest-sideband-testing-host-perf.data ]
1469 [ perf record: Captured and wrote 1.247 MB guest-sideband-testing-guest-perf.data ]
1471 And then copy guest-sideband-testing-guest-perf.data to the host (not shown here).
1479 $ perf inject -i guest-sideband-testing-host-perf.data --vm-time-correlation=dry-run
1487 …$ perf inject -i guest-sideband-testing-host-perf.data --vm-time-correlation=0xfffffa6ae070cb20 --…
1491 $ perf script -i guest-sideband-testing-host-perf.data --no-itrace --show-task-events | grep KVM
1497 Note, the QEMU option -name debug-threads=on is needed so that thread names
1502 $ mkdir -p ~/guestmount/13376
1503 $ sshfs -o direct_io vm_to_test:/ ~/guestmount/13376
1508 If needed, VDSO can be copied manually in a fashion similar to that used by the perf-archive script.
1510 …$ perf inject -i guest-sideband-testing-host-perf.data -o inj --guestmount ~/guestmount --guest-da…
1516 - the CPU displayed, [002] in this case, is always the host CPU
1517 …- events happening in the virtual machine start with VM:13376 VCPU:003, which shows the hypervisor…
1518 - only calls and errors are displayed i.e. --itrace=ce
1519 …- branches entering and exiting the virtual machine are split, and show as 2 branches to/from "0 […
1521 …$ perf script -i inj --itrace=ce -F+machine_pid,+vcpu,+addr,+pid,+tid,-period --ns --time 7919.408…
1526 …nown] ([unknown]) => 7f851c9b5a5c init_cacheinfo+0x3ac (/usr/lib/x86_64-linux-gnu/libc-2.31.so)
1527 … branches: 7f851c9b5a5a init_cacheinfo+0x3aa (/usr/lib/x86_64-linux-gnu/libc-2.31.so) => …
1577 …nown] ([unknown]) => 7f851c9b5a5c init_cacheinfo+0x3ac (/usr/lib/x86_64-linux-gnu/libc-2.31.so)
1578 …dl_init+0x74 (/usr/lib/x86_64-linux-gnu/ld-2.31.so) => 7f851cb7bf50 call_init.part.0+0x0 (/usr…
1587 Tracing Virtual Machines - Guest Code
1588 -------------------------------------
1595 addresses. To support that, option "--guest-code" has been added to perf script
1600 …# perf record --kcore -e intel_pt/cyc/ -- tools/testing/selftests/kselftest_install/kvm/tsc_msrs_t…
1603 # perf script --guest-code --itrace=bep --ns -F-period,+addr,+flags
1633 # perf kvm --guest-code --guest --host report -i perf.data --stdio | head -20
1635 # To display the perf.data header info, please use --header/--header-only options.
1648 ---entry_SYSCALL_64_after_hwframe
1651 |--29.44%--syscall_exit_to_user_mode
1658 -----------
1671 7 VMENTRY VM-Entry
1672 8 VMEXIT VM-Exit
1673 9 VMEXIT_INTR VM-Exit due to interrupt
1676 For more details, refer to the Intel 64 and IA-32 Architectures Software
1684 perf record -e intel_pt/event/u uname
1686 Event trace events are output using the --itrace I option. e.g.
1688 perf script --itrace=Ie
1701 iflag: t IFLAG: 1->0 via branch
1711 t interrupts become disabled IF=1 -> IF=0
1713 Dt interrupts become enabled IF=0 -> IF=1
1715 The intel-pt-events.py script illustrates how to access Event Trace information
1720 -----------
1724 perf record -e intel_pt/notnt/u uname
1726 In that case the --itrace q option is forced because walking executable code
1731 ----------------
1807 $ gcc -Wall -Wextra -O3 -g -o eg_ptw eg_ptw.c
1808 $ perf record -e intel_pt//u ./eg_ptw 0x1234567890abcdef
1811 $ perf script --itrace=ew
1817 -------
1825 --------
1827 linkperf:perf-record[1], linkperf:perf-script[1], linkperf:perf-report[1],
1828 linkperf:perf-inject[1]