Lines Matching +full:50 +full:a

45 RDT features are orthogonal. A particular system may support only
47 pseudo-locking is a unique way of using cache control to "pin" or
79 must be set when writing a mask.
94 resources have been allocated and a "0" is found
95 in "bit_usage" it is a sign that resources are
100 but available for software use. If a resource
110 well as a resource group's allocation.
143 of a physical core are throttled in cases where they
167 bytes) at which a previously used LLC_occupancy
170 Finally, in the top level of the "info" directory there is a file
191 On a system with RDT control features additional directories can be
196 On a system with RDT monitoring the root directory and other top level
197 directories contain a directory named "mon_groups" in which additional
202 Removing a directory will move all tasks and cpus owned by the group it
210 this group. Writing a task id to the file will add a task to the
211 group. If the group is a CTRL_MON group the task is removed from
213 any MON group that owned the task. If the group is a MON group,
219 Reading this file shows a bitmask of the logical CPUs owned by
220 this group. Writing a mask to this file will add and remove
221 CPUs to/from this group. As with the tasks file a hierarchy is
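A minimal sketch of both files in use; the group name "g1", the PID and the CPU mask are hypothetical::

  # mkdir /sys/fs/resctrl/g1
  # echo 1234 > /sys/fs/resctrl/g1/tasks
  # echo f0 > /sys/fs/resctrl/g1/cpus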
236 A list of all the resources available to this group.
246 allocations. A "shareable" resource group allows sharing of its
247 allocations while an "exclusive" resource group does not. A
257 This contains a set of files organized by L3 domain and by
258 RDT event. E.g. on a system with two L3 domains there will
261 "mbm_total_bytes", and "mbm_local_bytes"). In a MON group these
262 files provide a read out of the current value of the event for
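A sketch of reading one such file from the root group; the value shown is illustrative::

  # cat /sys/fs/resctrl/mon_data/mon_L3_00/llc_occupancy
  16234000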
270 When a task is running, the following rules define which resources are
273 1) If the task is a member of a non-default group, then the schemata
276 2) Else if the task belongs to the default group, but is running on a
284 1) If a task is a member of a MON group, or non-default CTRL_MON group
287 2) If a task is a member of the default CTRL_MON group, but is running
288 on a CPU that is assigned to some specific group, then the RDT events
297 When moving a task from one group to another you should remember that
299 a task in a monitor group showing 3 MB of cache occupancy. If you move
300 to a new group and immediately check the occupancy of the old and new
303 before the move, the h/w does not update any counters. On a busy system
309 The same applies to cache allocation control. Moving a task to a group
310 with a smaller cache partition will not evict any cache lines. The
314 to identify a control group and a monitoring group respectively. Each of
317 a "CTRL_MON" directory may fail if we run out of either CLOSID or RMID
326 occupancy has gone down. If there is a time when the system has a lot of
330 max_threshold_occupancy is a user-configurable value to determine the
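A sketch of tuning it so that limbo RMIDs are freed sooner; the path assumes L3 monitoring support and the value is in bytes::

  # echo 0 > /sys/fs/resctrl/info/L3_MON/max_threshold_occupancy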
342 caches are generally just shared by the hyperthreads on a core, but this
344 caches on a socket, multiple cores could share an L2 cache. So instead
346 a resource we use a "Cache ID". At a given cache level this will be a
347 unique number across the whole system (but it isn't guaranteed to be a
354 for allocation using a bitmask. The maximum value of the mask is defined
358 requires that these masks have all the '1' bits in a contiguous block. So
360 and 0xA are not. On a system with a 20-bit mask each bit represents 5%
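The full mask can be read from the "info" directory; a sketch, assuming L3 CAT with the 20-bit mask described here::

  # cat /sys/fs/resctrl/info/L3/cbm_mask
  fffff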
377 Bandwidth throttling is a core-specific mechanism on some Intel
378 SKUs. Using a high bandwidth and a low bandwidth setting on two threads
379 sharing a core may result in both threads being throttled to use the
382 The fact that memory bandwidth allocation (MBA) may be a core
392 external bandwidth. Consider an SKL SKU with 24 cores on a package and
394 240GBps) and L3 external bandwidth is 100GBps. Now a workload with '20
395 threads, having 50% bandwidth, each consuming 5GBps' consumes the max L3
396 bandwidth of 100GBps although the percentage value specified is only 50%
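(20 threads * 5GBps = 100GBps, so the L3 external bandwidth is saturated even though each thread stays within its 50% core-local limit.)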
405 For the same SKU in #1, a 'single thread, with 10% bandwidth' and '4
413 kernel underneath would use a software feedback mechanism or a "Software
421 a mount option 'mba_MBps'. The schemata format is specified in the below
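A sketch of enabling this mode at mount time::

  # mount -t resctrl resctrl -o mba_MBps /sys/fs/resctrl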
484 CAT enables a user to specify the amount of cache space that an
485 application can fill. Cache pseudo-locking builds on the fact that a
487 allocated area on a cache hit. With cache pseudo-locking, data can be
488 preloaded into a reserved portion of cache that no application can
492 a region of memory with reduced average read latency.
494 The creation of a cache pseudo-locked region is triggered by a request
495 from the user to do so that is accompanied by a schemata of the region
498 - Create a CAT allocation CLOSNEW with a CBM matching the schemata
503 - Create a contiguous region of memory of the same size as the cache
516 user-space as a character device.
526 It is required that an application using a pseudo-locked region runs
527 with affinity to the cores (or a subset of the cores) associated
528 with the cache on which the pseudo-locked region resides. A sanity check
537 1) During the first stage the system administrator allocates a portion
540 cache portion, and exposed as a character device.
541 2) During the second stage a user-space application maps (mmap()) the
546 A pseudo-locked region is created using the resctrl interface as follows:
548 1) Create a new resource group by creating a new directory in /sys/fs/resctrl.
556 "pseudo-locked" and a new character device with the same name as the resource
567 There is no explicit way for the kernel to test if a provided memory
573 from these measurements are best visualized using a hist trigger (see
575 a stride of 32 bytes while hardware prefetchers and preemption
576 are disabled. This also provides a substitute visualization of cache
582 When a pseudo-locked region is created, a new debugfs directory is created for
583 it in debugfs as /sys/kernel/debug/resctrl/<newdir>. A single
606 In this example a pseudo-locked region named "newlock" was created. Here is
608 visualize this data with a histogram that is available if CONFIG_HIST_TRIGGERS
624 { latency: 50 } hitcount: 83
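A sketch of the steps that produce such a histogram, assuming tracefs at /sys/kernel/tracing, debugfs at /sys/kernel/debug and a region named "newlock"::

  # echo 'hist:keys=latency' > /sys/kernel/tracing/events/resctrl/pseudo_lock_mem_latency/trigger
  # echo 1 > /sys/kernel/tracing/events/resctrl/pseudo_lock_mem_latency/enable
  # echo 1 > /sys/kernel/debug/resctrl/newlock/pseudo_lock_measure
  # echo 0 > /sys/kernel/tracing/events/resctrl/pseudo_lock_mem_latency/enable
  # cat /sys/kernel/tracing/events/resctrl/pseudo_lock_mem_latency/hist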
640 In this example a pseudo-locked region named "newlock" was created on the L2
641 cache of a platform. Here is how we can obtain details of the cache hits
668 On a two socket machine (one L3 cache per socket) with just four bits
669 for cache bit masks, minimum b/w of 10% with a memory bandwidth
676 # echo "L3:0=3;1=c\nMB:0=50;1=50" > /sys/fs/resctrl/p0/schemata
677 # echo "L3:0=3;1=3\nMB:0=50;1=50" > /sys/fs/resctrl/p1/schemata
683 "lower" 50% on cache ID 0, and the "upper" 50% of cache ID 1.
684 Tasks in group "p1" use the "lower" 50% of cache on both sockets.
686 Similarly, tasks that are under the control of group "p0" may use a
687 maximum memory b/w of 50% on socket 0 and 50% on socket 1.
688 Tasks in group "p1" may also use 50% memory b/w on both sockets.
701 In the above example the tasks in "p1" and "p0" on socket 0 would use a max b/w
706 Again two sockets, but this time with a more realistic 20-bit mask.
709 processor 1 on socket 0 on a 2-socket, dual-core machine. To avoid noisy
718 50% of the L3 cache on socket 0 and 50% of memory b/w cannot be used by
721 # echo "L3:0=3ff;1=fffff\nMB:0=50;1=100" > schemata
723 Next we make a resource group for our first real-time task and give
731 also use taskset(1) to ensure the task always runs on a dedicated CPU
763 A single socket system which has real-time tasks running on cores 4-7 and
765 and data, so a per-task association is not required and due to interaction
774 50% of the L3 cache on socket 0, and 50% of memory bandwidth on socket 0
777 # echo "L3:0=3ff\nMB:0=50" > schemata
779 Next we make a resource group for our real-time cores and give it access
780 to the "top" 50% of the cache on socket 0 and 50% of memory bandwidth on
785 # echo "L3:0=ffc00\nMB:0=50" > p0/schemata
788 kernel and the tasks running there get 50% of the cache. They should
789 also get 50% of memory bandwidth assuming that the cores 4-7 are SMT
799 configures a cache allocation then nothing prevents another resource group
802 In this example a new exclusive resource group will be created on an L2 CAT
842 A new resource group will, on creation, not overlap with an exclusive resource
857 A resource group cannot be forced to overlap with an exclusive resource group::
884 Create a new resource group that will be associated with the pseudo-locked
885 region, indicate that it will be used for a pseudo-locked region, and
979 2. Find a contiguous set of bits in the global CBM bitmask that is clear
981 3. Create a new directory
991 Locking is based on flock, which is available in libc and also as a shell
996 A) Take flock(LOCK_EX) on /sys/fs/resctrl
1002 A) Take flock(LOCK_SH) on /sys/fs/resctrl
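A sketch of both sequences using the flock(1) shell utility; "create-dir.sh" is a placeholder for a script that reads the existing groups, derives a free mask and creates the new directory::

  $ flock -s /sys/fs/resctrl/ find /sys/fs/resctrl
  $ flock /sys/fs/resctrl/ ./create-dir.sh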
1095 On a two socket machine (one L3 cache per socket) with just four bits
1110 "lower" 50% on cache ID 0, and the "upper" 50% of cache ID 1.
1111 Tasks in group "p1" use the "lower" 50% of cache on both sockets.
1113 Create monitor groups and assign a subset of tasks to each monitor group.
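A sketch, assuming control group "p1" already exists; the monitor group names and PIDs are hypothetical::

  # cd /sys/fs/resctrl/p1/mon_groups
  # mkdir m11 m12
  # echo 5678 > m11/tasks
  # echo 5679 > m12/tasks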
1137 Example 2 (Monitor a task from its creation)
1139 On a two socket machine (one L3 cache per socket)::
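  # mount -t resctrl resctrl /sys/fs/resctrl
  # cd /sys/fs/resctrl
  # mkdir p0 p1

An RMID is allocated to a group as soon as it is created, so a task launched from a shell that has already been moved into "p1" is monitored from its first instruction; "<cmd>" is a placeholder for the workload::

  # echo $$ > /sys/fs/resctrl/p1/tasks
  # <cmd>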
1160 Assume a system like HSW has only CQM and no CAT support. In this case
1195 A single socket system which has real-time tasks running on cores 4-7