For general info and legal blurb, please look in index.rst.

------------------------------------------------------------------------------

This file contains the documentation for the sysctl files in /proc/sys/vm.

The files in this directory can be used to tune the operation
of the virtual memory (VM) subsystem of the Linux kernel and
the writeout of dirty data to disk.

Default values and initialization routines for most of these
files can be found in mm/swap.c.

Currently, these files are in /proc/sys/vm:
- admin_reserve_kbytes
- compact_memory
- compaction_proactiveness
- compact_unevictable_allowed
- dirty_background_bytes
- dirty_background_ratio
- dirty_bytes
- dirty_expire_centisecs
- dirty_ratio
- dirtytime_expire_seconds
- dirty_writeback_centisecs
- drop_caches
- extfrag_threshold
- highmem_is_dirtyable
- hugetlb_shm_group
- laptop_mode
- legacy_va_layout
- lowmem_reserve_ratio
- max_map_count
- memory_failure_early_kill
- memory_failure_recovery
- min_free_kbytes
- min_slab_ratio
- min_unmapped_ratio
- mmap_min_addr
- mmap_rnd_bits
- mmap_rnd_compat_bits
- nr_hugepages
- nr_hugepages_mempolicy
- nr_overcommit_hugepages
- nr_trim_pages (only if CONFIG_MMU=n)
- numa_zonelist_order
- oom_dump_tasks
- oom_kill_allocating_task
- overcommit_kbytes
- overcommit_memory
- overcommit_ratio
- page-cluster
- panic_on_oom
- percpu_pagelist_high_fraction
- stat_interval
- stat_refresh
- numa_stat
- swappiness
- unprivileged_userfaultfd
- user_reserve_kbytes
- vfs_cache_pressure
- watermark_boost_factor
- watermark_scale_factor
- zone_reclaim_mode
The amount of free memory in the system that should be reserved for users
That should provide enough for the admin to log in and kill a process,
for the full Virtual Memory Size of programs used to recover. Otherwise,
root may not be able to log in to recover the system.
Changing this takes effect whenever an application requests memory.
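The default admin reserve can be sketched as a small calculation. Assuming the documented default of min(3% of free memory, 8 MiB), a hypothetical helper (not kernel code) might look like:

```python
def default_admin_reserve_kbytes(free_kbytes: int) -> int:
    """Sketch of the documented default for admin_reserve_kbytes:
    the smaller of 3% of free memory and 8 MiB (8192 KiB).
    Hypothetical helper, not taken from the kernel source."""
    return min(free_kbytes * 3 // 100, 8192)

# On a box with 16 GiB free, the 8 MiB cap applies:
print(default_admin_reserve_kbytes(16 * 1024 * 1024))  # 8192
```

On small-memory systems the 3% term dominates instead, which is why the documentation suggests raising the value when running in overcommit 'never' mode.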
all zones are compacted such that free memory is available in contiguous
blocks where possible. This can be important for example in the allocation of
huge pages although processes will also directly compact memory as required.
This tunable takes a value in the range [0, 100] with a default value of
20. It determines how aggressively compaction is done in the background.
Note that compaction has a non-trivial system-wide impact as pages
to latency spikes in unsuspecting applications. The kernel employs
acceptable trade for large contiguous free memory. Set to 0 to prevent
On CONFIG_PREEMPT_RT the default value is 0 in order to avoid a page fault, due
Contains the amount of dirty memory at which the background kernel
immediately taken into account to evaluate the dirty memory limits and the
Contains, as a percentage of total available memory that contains free pages
The total available memory is not equal to total system memory.
Contains the amount of dirty memory at which a process generating disk writes
account to evaluate the dirty memory limits and the other appears as 0 when
Note: the minimum value allowed for dirty_bytes is two pages (in bytes); any
for writeout by the kernel flusher threads. It is expressed in 100'ths
of a second. Data which has been dirty in-memory for longer than this
Contains, as a percentage of total available memory that contains free pages
The total available memory is not equal to total system memory.
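The interplay between the *_bytes and *_ratio knobs can be sketched as follows. Only one of each pair is in effect at a time, and the other reads back as 0. This is a simplified model of the documented behaviour, not the kernel's actual dirty accounting:

```python
PAGE_SIZE = 4096  # assumed page size for this sketch

def dirty_threshold_pages(available_pages, dirty_bytes=0, dirty_ratio=20):
    """Simplified model: if dirty_bytes is nonzero it is in effect,
    otherwise the threshold is dirty_ratio percent of available memory."""
    if dirty_bytes:
        return dirty_bytes // PAGE_SIZE
    return available_pages * dirty_ratio // 100

print(dirty_threshold_pages(1_000_000))                     # 200000 pages
print(dirty_threshold_pages(1_000_000, dirty_bytes=2**20))  # 256 pages
```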
out to disk. This tunable expresses the interval between those wakeups, in
memory becomes free.
This is a non-destructive operation and will not free any dirty objects.
reclaimed by the kernel when memory is needed elsewhere on the system.
You may see informational messages in your kernel log when this file is
This parameter affects whether the kernel will compact memory or direct
reclaim to satisfy a high-order allocation. The extfrag/extfrag_index file in
debugfs shows what the fragmentation index for each order is in each zone in
of memory, values towards 1000 imply failures are due to fragmentation and -1
The kernel will not compact memory in a zone if the
This parameter controls whether the high memory is considered for dirty
only the amount of memory directly visible/usable by the kernel can
be dirtied. As a result, on systems with a large amount of memory and
Changing the value to nonzero would allow more memory to be dirtied
storage more effectively. Note this also comes with a risk of premature
only use the low memory and they can fill it up with dirty data without
shared memory segment using hugetlb pages.
controlled by this knob are discussed in Documentation/admin-guide/laptops/laptop-mode.rst.
If non-zero, this sysctl disables the new 32-bit mmap layout - the kernel
the kernel to allow process memory to be allocated from the "lowmem"
zone. This is because that memory could then be pinned via the mlock()
And on large highmem machines this lack of reclaimable lowmem memory
captured into pinned user memory.
in defending these lower zones.
in /proc/zoneinfo, as in the following example (from an x86-64 box).
In this example, if normal pages (index=2) are requested from this DMA zone and
zone[i]->protection[j]
The minimum value is 1 (1/1 -> 100%). A value less than 1 completely
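The zone[i]->protection[j] values are derived from the managed pages of the higher zones divided by the ratio for zone i. The sketch below mirrors that description rather than the kernel source, using hypothetical zone page counts:

```python
def lowmem_reserve(managed_pages, ratios):
    """protection[i][j]: pages reserved in zone i against allocations
    that could also have been satisfied from a higher zone j (j > i).
    Sketch of the documented calculation, not kernel code."""
    n = len(managed_pages)
    protection = [[0] * n for _ in range(n)]
    for i in range(n):
        higher = 0  # running sum of managed pages in zones above i
        for j in range(i + 1, n):
            higher += managed_pages[j]
            protection[i][j] = higher // ratios[i]
    return protection

# A DMA zone (ratio 256) defending itself against Normal-zone pressure:
print(lowmem_reserve([4000, 100_000], [256, 32])[0][1])  # 390
```

A smaller ratio therefore means a larger reserve, which is why 1 (1/1) is the maximum level of protection.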
This file contains the maximum number of memory map areas a process
may have. Memory map areas are used as a side-effect of calling
Controls how to kill processes when an uncorrected memory error (typically
a 2-bit error in a memory module) is detected in the background by hardware
that cannot be handled by the kernel. In some cases (like the page
Enable memory failure recovery (when supported by the platform)
0: Always panic on a memory failure.
watermark[WMARK_MIN] value for each lowmem zone in the system.
Some minimal amount of memory is needed to satisfy PF_MEMALLOC
A percentage of the total pages in each zone. On Zone reclaim
than this percentage of pages in a zone are reclaimable slab pages.
This ensures that the slab growth stays under control even in NUMA
Note that slab reclaim is triggered in a per-zone / per-node fashion.
The process of reclaiming slab memory is currently not node specific
This is a percentage of the total pages in each zone. Zone reclaim will
only occur if more than this percentage of pages are in a state that
against all file-backed unmapped pages including swapcache pages and tmpfs
accidentally operate based on the information in the first couple of pages
of memory userspace processes should not be allowed to write to them. By
vast majority of applications to work correctly and provide defense in depth
resulting from mmap allocations for applications run in
See Documentation/admin-guide/mm/hugetlbpage.rst
Change the size of the hugepage pool at run-time on a specific
See Documentation/admin-guide/mm/hugetlbpage.rst
See Documentation/admin-guide/mm/hugetlbpage.rst
This value adjusts the excess page trimming behaviour of power-of-2 aligned
See Documentation/admin-guide/mm/nommu-mmap.rst for more information.
'where the memory is allocated from' is controlled by zonelists.
In the non-NUMA case, a zonelist for GFP_KERNEL is ordered as follows:
ZONE_NORMAL -> ZONE_DMA
This means that a memory allocation request for GFP_KERNEL will
get memory from ZONE_DMA only when ZONE_NORMAL is not available.
In the NUMA case, you can think of the following two types of order.
(A) Node(0) ZONE_NORMAL -> Node(0) ZONE_DMA -> Node(1) ZONE_NORMAL
(B) Node(0) ZONE_NORMAL -> Node(1) ZONE_NORMAL -> Node(0) ZONE_DMA.
out-of-memory (OOM) in ZONE_DMA because ZONE_DMA tends to be small.
On 32-bit, the Normal zone needs to be preserved for allocations accessible
On 64-bit, devices that require DMA32/DMA are relatively rare, so "node"
Enables a system-wide task dump (excluding kernel threads) to be produced
when the kernel performs an OOM-killing and includes such information as
the memory state information for each one. Such systems should not
be forced to incur a performance penalty in OOM conditions when the
If this is set to non-zero, this information is shown whenever the
OOM killer actually kills a memory-hogging task.
This enables or disables killing the OOM-triggering task in
out-of-memory situations.
selects a rogue memory-hogging task that frees up a large amount of
memory when killed.
If this is set to non-zero, the OOM killer simply kills the task that
triggered the out-of-memory condition. This avoids the expensive
is used in oom_kill_allocating_task.
This value contains a flag that enables memory overcommitment.
of free memory left when userspace requests more memory.
memory until it actually runs out.
policy that attempts to prevent any overcommit of memory.
programs that malloc() huge amounts of memory "just-in-case"
See Documentation/vm/overcommit-accounting.rst and
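Under the strict policy (overcommit_memory=2), the commit limit is derived from overcommit_ratio, or from overcommit_kbytes which takes precedence when nonzero, plus swap. A simplified sketch of that arithmetic, ignoring the hugetlb pages the kernel also subtracts:

```python
def commit_limit_kb(ram_kb, swap_kb, overcommit_ratio=50, overcommit_kbytes=0):
    """Simplified CommitLimit calculation for overcommit_memory=2.
    overcommit_kbytes, when nonzero, overrides overcommit_ratio.
    Ignores hugetlb pages; a sketch, not the kernel's implementation."""
    if overcommit_kbytes:
        return overcommit_kbytes + swap_kb
    return ram_kb * overcommit_ratio // 100 + swap_kb

# 8 GB RAM, 2 GB swap, default ratio of 50%:
print(commit_limit_kb(8_000_000, 2_000_000))  # 6000000 kB
```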
page-cluster

page-cluster controls the number of pages up to which consecutive pages
are read in from swap in a single attempt. This is the swap counterpart
The consecutivity mentioned is not in terms of virtual/physical addresses,
but consecutive in swap space - that means the pages were swapped out together.
It is a logarithmic value - setting it to zero means "1 page", setting
small benefits in tuning this to a different value if your workload is
swap-intensive.
that consecutive pages readahead would have brought in.
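Since page-cluster is logarithmic, the number of pages read per swap-in attempt is 2 raised to that value:

```python
def swap_readahead_pages(page_cluster: int) -> int:
    """page-cluster is a log2 value: 0 -> 1 page, 3 (a common
    default) -> 8 pages per swap readahead attempt."""
    return 2 ** page_cluster

print([swap_readahead_pages(n) for n in range(4)])  # [1, 2, 4, 8]
```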
This enables or disables the panic-on-out-of-memory feature.
If this is set to 1, the kernel panics when out-of-memory happens.
and those nodes become exhausted of memory, one process
may be killed by the oom-killer. No panic occurs in this case,
because other nodes' memory may still be free; the system-wide state
may not be fatal yet.
above-mentioned. Even if OOM happens under a memory cgroup, the whole
This is the fraction of pages in each zone that can be stored on
per-cpu page lists. It is an upper boundary that is divided depending
that we do not allow more than 1/8th of pages in each zone to be stored
on per-cpu page lists. This entry only changes the value of hot per-cpu
each zone between per-cpu lists.
The batch value of each per-cpu page list remains the same regardless of
The initial value is zero. The kernel uses this value to set pcp->high
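The upper bound described above can be sketched as the zone's managed pages divided by the fraction, split across the zone's online CPUs. This is a simplification of the kernel's actual heuristics, with hypothetical page counts:

```python
def pcp_high_per_cpu(managed_pages, high_fraction, online_cpus):
    """Rough per-CPU high mark: 1/high_fraction of the zone's pages,
    divided among the zone's online CPUs. A simplified sketch."""
    return managed_pages // (high_fraction * online_cpus)

# 1/8th of a 1M-page zone spread over 4 CPUs:
print(pcp_high_per_cpu(1_000_000, 8, 4))  # 31250
```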
Any read or write (by root only) flushes all the per-cpu vm statistics
As a side-effect, it also checks for negative totals (elsewhere reported
as 0) and "fails" with EINVAL if any are found, with a warning in dmesg.
assumes equal IO cost and will thus apply memory pressure to the page
cache and swap-backed pages equally; lower values signify more
Keep in mind that filesystem IO patterns under memory pressure tend to
experimentation and will also be workload-dependent.
For in-memory swap, like zram or zswap, as well as hybrid setups that
file-backed pages is less than the high watermark in a zone.
This flag controls the mode in which unprivileged users can use the
to handle page faults in user mode only. In this case, users without
CAP_SYS_PTRACE must pass UFFD_USER_MODE_ONLY in order for userfaultfd to
min(3% of current process size, user_reserve_kbytes) of free memory.
This is intended to prevent a user from starting a single memory hogging
all free memory with a single process, minus admin_reserve_kbytes.
Any subsequent attempts to execute a command will result in
"fork: Cannot allocate memory".
Changing this takes effect whenever an application requests memory.
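The per-process reserve above, min(3% of current process size, user_reserve_kbytes), can be sketched directly; the 131072 KiB (128 MiB) default below is an assumption for illustration:

```python
def user_reserve_kb(process_size_kb, user_reserve_kbytes=131_072):
    """Free memory kept back from a single process in overcommit
    'never' mode: the smaller of 3% of the process size and
    user_reserve_kbytes. Sketch of the documented formula."""
    return min(process_size_kb * 3 // 100, user_reserve_kbytes)

print(user_reserve_kb(1_000_000))   # 30000 kB for a ~1 GB process
print(user_reserve_kb(10_000_000))  # capped at 131072 kB
```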
the memory which is used for caching of directory and inode objects.
never reclaim dentries and inodes due to memory pressure and this can easily
lead to out-of-memory conditions. Increasing vfs_cache_pressure beyond 100
This factor controls the level of reclaim when memory is being fragmented.
The intent is that compaction has less work to do in the future and to
increase the success rate of future high-order allocations such as SLUB
parameter, the unit is in fractions of 10,000. The default value of
15,000 means that up to 150% of the high watermark will be reclaimed in the
is determined by the number of fragmentation events that occurred in the
worth of pages will be reclaimed (e.g. 2MB on 64-bit x86). A boost factor
amount of memory left in a node/system before kswapd is woken up and
how much memory needs to be free before kswapd goes back to sleep.
The unit is in fractions of 10,000. The default value of 10 means the
distances between watermarks are 0.1% of the available memory in the
node/system. The maximum value is 1000, or 10% of memory.
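Since the unit is fractions of 10,000, the gap between consecutive watermarks can be sketched as:

```python
def watermark_gap_pages(managed_pages, scale_factor=10):
    """Distance between consecutive watermarks: scale_factor/10,000 of
    the node/zone's memory (default 10 -> 0.1%, maximum 1000 -> 10%)."""
    return managed_pages * scale_factor // 10_000

print(watermark_gap_pages(4_000_000))        # 4000 pages (0.1%)
print(watermark_gap_pages(4_000_000, 1000))  # 400000 pages (10%)
```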
too small for the allocation bursts occurring in the system. This knob
reclaim memory when a zone runs out of memory. If it is set to zero then no
in the system.
and that accessing remote memory would cause a measurable performance
since it cannot use all of system memory to buffer the outgoing writes
anymore, but it preserves the memory on other nodes so that the performance
node unless explicitly overridden by memory policies or cpuset