Lines Matching +full:high +full:- +full:performance

7 …-lines are fetched from the memory subsystem; parsed into instructions; and lastly decoded into mi…
15 …he CPU was stalled due to Frontend latency issues. For example; instruction-cache misses; iTLB mi…
31 … corrected path; following all sorts of miss-predicted branches. For example; branchy code with lo…
39 …o switches from DSB to MITE pipelines. The DSB (decoded i-cache) is a Uop Cache where the front-en…
55 …-cache) or MITE (legacy instruction decode) pipelines. Certain operations cannot be handled native…
60 "MetricExpr": "tma_frontend_bound - tma_fetch_latency",
68 …"MetricExpr": "(UOPS_ISSUED.ANY - UOPS_RETIRED.RETIRE_SLOTS + 4 * ((INT_MISC.RECOVERY_CYCLES_ANY /…
71 …s for which the issue-pipeline was blocked due to recovery from earlier incorrect speculation. For…
79 …etched from an incorrectly speculated program path; or stalls when the out-of-order part of the ma…
84 "MetricExpr": "tma_bad_speculation - tma_branch_mispredicts",
87 …-of-order portion of the machine needs to recover its state after the clear. For example; this can…
92 "MetricExpr": "1 - (tma_frontend_bound + tma_bad_speculation + tma_retiring)",
95 …-of-order scheduler dispatches ready uops into their respective execution units; and once complete…
100 …CHED.THREAD\\,cmask\\=1@ - cpu@UOPS_DISPATCHED.THREAD\\,cmask\\=3@ if (IPC > 1.8) else cpu@UOPS_DI…
103 …o demand load or store instructions. This accounts mainly for (1) non-completed in-flight memory d…
111 …-aside Buffers) are processor caches for recently used entries out of the Page Tables that are use…
119 …e misses (i.e. L2 misses/L3 hits) can improve the latency and increase performance. Sample with: M…
124 …"MetricExpr": "(1 - (MEM_LOAD_UOPS_RETIRED.LLC_HIT / (MEM_LOAD_UOPS_RETIRED.LLC_HIT + 7 * MEM_LOAD…
127 …y (DRAM) by loads. Better caching can improve the latency and increase performance. Sample with: M…
131 …"BriefDescription": "This metric estimates fraction of cycles where the core's performance was lik…
135 …performance was likely hurt due to approaching bandwidth limits of external memory (DRAM). The un…
139 …"BriefDescription": "This metric estimates fraction of cycles where the performance was likely hur…
140 …CLK_UNHALTED.THREAD, OFFCORE_REQUESTS_OUTSTANDING.CYCLES_WITH_DATA_RD) / CLKS - tma_mem_bandwidth",
143 …"PublicDescription": "This metric estimates fraction of cycles where the performance was likely hu…
147 … CPU was stalled due to RFO store memory accesses; RFO store issue a read-for-ownership request b…
151 …ses; RFO store issue a read-for-ownership request before the write. Even though store accesses do …
155 …"BriefDescription": "This metric represents fraction of slots where Core non-memory issues were of…
156 "MetricExpr": "tma_backend_bound - tma_memory_bound",
159 …-memory issues were of a bottleneck. Shortage in hardware compute resources; or dependencies in s…
171 …stimates fraction of cycles the CPU performance was potentially limited due to Core computation is…
172 …- cpu@UOPS_DISPATCHED.THREAD\\,cmask\\=3@ if (IPC > 1.8) else cpu@UOPS_DISPATCHED.THREAD\\,cmask\\…
175 …performance was potentially limited due to Core computation issues (non divider-related). Two dis…
183 …-per-cycle (see IPC metric). Note that a high Retiring value does not necessary mean there is no r…
187 …slots where the CPU was retiring light-weight operations -- instructions that require no more than…
188 "MetricExpr": "tma_retiring - tma_heavy_operations",
191 …-weight operations -- instructions that require no more than one uop (micro-operation). This corre…
195 …"BriefDescription": "This metric represents overall arithmetic floating-point (FP) operations frac…
199 …-point (FP) operations fraction the CPU has executed (retired). Note this metric's value may excee…
207 …FP arithmetic operations; hence may be used as a thermometer to avoid X87 high usage and preferabl…
211 …"BriefDescription": "This metric approximates arithmetic floating-point (FP) scalar uops fraction …
215 …"PublicDescription": "This metric approximates arithmetic floating-point (FP) scalar uops fraction…
219 …"BriefDescription": "This metric approximates arithmetic floating-point (FP) vector uops fraction …
223 …"PublicDescription": "This metric approximates arithmetic floating-point (FP) vector uops fraction…
227 …tric represents fraction of slots where the CPU was retiring heavy-weight operations -- instructio…
231 …he CPU was retiring heavy-weight operations -- instructions that require two or more uops or micro…
261 … "BriefDescription": "Per-Logical Processor actual clocks when the Logical Processor is active.",
267 …"BriefDescription": "Total issue-pipeline slots (per-Physical Core till ICL; per-Logical Processor…
273 "BriefDescription": "The ratio of Executed- by Issued-Uops",
277 …iption": "The ratio of Executed- by Issued-Uops. Ratio > 1 suggests high rate of uop micro-fusions…
280 "BriefDescription": "Instructions Per Cycle across hyper-threads (per physical core)",
292 …"BriefDescription": "Instruction-Level-Parallelism (average number of uops executed when there is …
338 … supported options of: FP precisions, scalar and vector instructions, vector-width and AMX engine."
348 …"MetricExpr": "1 - CPU_CLK_UNHALTED.ONE_THREAD_ACTIVE / (CPU_CLK_UNHALTED.REF_XCLK_ANY / 2) if #SM…
384 "MetricExpr": "(cstate_core@c3\\-residency@ / msr@tsc@) * 100",
390 "MetricExpr": "(cstate_core@c6\\-residency@ / msr@tsc@) * 100",
396 "MetricExpr": "(cstate_core@c7\\-residency@ / msr@tsc@) * 100",
402 "MetricExpr": "(cstate_pkg@c2\\-residency@ / msr@tsc@) * 100",
408 "MetricExpr": "(cstate_pkg@c3\\-residency@ / msr@tsc@) * 100",
414 "MetricExpr": "(cstate_pkg@c6\\-residency@ / msr@tsc@) * 100",
420 "MetricExpr": "(cstate_pkg@c7\\-residency@ / msr@tsc@) * 100",
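
Several of the matched "MetricExpr" lines above are simple differences between a parent metric and one of its children, plus a percentage ratio for the C-state residency counters. The following is a minimal Python sketch of that arithmetic only, not of how perf evaluates these expressions. The function names and the result labels "frontend_remainder", "bad_speculation_remainder", "backend_remainder" and "retiring_remainder" are hypothetical, since the matched lines do not show which metric name each expression belongs to. The input values in the usage lines are made up for illustration.

```python
def derive_from_top_level(frontend_bound, bad_speculation, retiring,
                          fetch_latency, branch_mispredicts,
                          memory_bound, heavy_operations):
    """Evaluate the subtraction-style expressions from the listing.

    All arguments are fractions in the range 0..1. The result keys other
    than "backend_bound" are hypothetical labels, not names from the file.
    """
    # Listing line 92: 1 - (tma_frontend_bound + tma_bad_speculation + tma_retiring)
    backend_bound = 1 - (frontend_bound + bad_speculation + retiring)
    return {
        "backend_bound": backend_bound,
        # Listing line 60: tma_frontend_bound - tma_fetch_latency
        "frontend_remainder": frontend_bound - fetch_latency,
        # Listing line 84: tma_bad_speculation - tma_branch_mispredicts
        "bad_speculation_remainder": bad_speculation - branch_mispredicts,
        # Listing line 156: tma_backend_bound - tma_memory_bound
        "backend_remainder": backend_bound - memory_bound,
        # Listing line 188: tma_retiring - tma_heavy_operations
        "retiring_remainder": retiring - heavy_operations,
    }


def residency_percent(residency_count, tsc_count):
    """Listing lines 384-420: (residency counter / msr tsc) * 100."""
    return (residency_count / tsc_count) * 100


# Illustrative values only; they are not measurements.
example = derive_from_top_level(
    frontend_bound=0.2, bad_speculation=0.1, retiring=0.4,
    fetch_latency=0.15, branch_mispredicts=0.08,
    memory_bound=0.2, heavy_operations=0.1,
)
example_residency = residency_percent(25, 100)
```

Because each child is defined as a remainder of its parent, the four top-level fractions sum to 1 by construction in this sketch.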