=================================
GFP masks used from FS/IO context
=================================

:Date: May, 2018
:Author: Michal Hocko <mhocko@kernel.org>

Introduction
============

Code paths in the filesystem and IO stacks must be careful when
allocating memory to prevent recursion deadlocks caused by direct
memory reclaim calling back into the FS or IO paths and blocking on
already held resources (e.g. locks - most commonly those used for the
transaction context).

The traditional way to avoid this deadlock problem is to clear __GFP_FS
or __GFP_IO respectively (note that clearing the latter implies clearing
the former as well) in the gfp mask when calling an allocator. GFP_NOFS
and GFP_NOIO can be used as shortcuts. It turned out, though, that this
approach has led to abuses where the restricted gfp mask is used "just in
case" without deeper consideration. This leads to problems because
excessive use of GFP_NOFS/GFP_NOIO can lead to memory over-reclaim or
other memory reclaim issues.

New API
========

Since 4.12 there is a generic scope API for both the NOFS and NOIO
contexts: ``memalloc_nofs_save``, ``memalloc_nofs_restore`` and
``memalloc_noio_save``, ``memalloc_noio_restore`` respectively. These
functions mark a scope as a critical section from a filesystem or I/O
point of view. Any allocation from that scope will inherently drop
__GFP_FS or __GFP_IO respectively from the given mask, so no memory
allocation can recurse back into the FS/IO.

.. kernel-doc:: include/linux/sched/mm.h
   :functions: memalloc_nofs_save memalloc_nofs_restore
.. kernel-doc:: include/linux/sched/mm.h
   :functions: memalloc_noio_save memalloc_noio_restore

FS/IO code then simply calls the appropriate save function before
any critical section with respect to the reclaim is started - e.g.
when taking a lock shared with the reclaim context or when a transaction
context nesting would be possible via reclaim. The restore function
should be called when the critical section ends. All of that should
ideally come with an explanation of what the reclaim context is, for
easier maintenance.

Please note that proper pairing of the save/restore functions
allows nesting, so it is safe to call ``memalloc_noio_save`` or
``memalloc_noio_restore`` respectively from an existing NOIO or NOFS
scope.

What about __vmalloc(GFP_NOFS)
==============================

vmalloc doesn't support the GFP_NOFS semantic because there are hardcoded
GFP_KERNEL allocations deep inside the allocator which are quite non-trivial
to fix up. That means that calling ``vmalloc`` with GFP_NOFS/GFP_NOIO is
almost always a bug. The good news is that the NOFS/NOIO semantic can be
achieved by the scope API.

In an ideal world, upper layers should already mark dangerous contexts,
so no special care is required and vmalloc can be called without
any problems. Sometimes, if the context is not really clear or there are
layering violations, the recommended way around that is to wrap ``vmalloc``
in the scope API with a comment explaining the problem.
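
The scope semantic described above can be illustrated with a small
userspace model. This is only a sketch under stated assumptions, not the
kernel implementation: every ``model_*`` / ``MODEL_*`` name and every bit
value below is hypothetical and exists only for this example. It shows why
a scope also covers an allocator with a hardcoded GFP_KERNEL deep inside
(the vmalloc case), and why properly paired save/restore calls nest.

```c
#include <assert.h>

/* Hypothetical bit values for this userspace model; NOT the kernel's. */
#define MODEL_GFP_IO     0x1u
#define MODEL_GFP_FS     0x2u
#define MODEL_GFP_KERNEL (MODEL_GFP_IO | MODEL_GFP_FS)

/* Hypothetical per-task scope flags (stand-ins for task state). */
#define MODEL_PF_NOFS    0x1u
#define MODEL_PF_NOIO    0x2u

static unsigned int model_task_flags;	/* models the current task's flags */

/* Enter a scope: remember whether it was already active, then set it. */
static unsigned int model_scope_save(unsigned int pf)
{
	unsigned int old = model_task_flags & pf;

	model_task_flags |= pf;
	return old;
}

/* Leave a scope: put the flag back to what the matching save saw. */
static void model_scope_restore(unsigned int pf, unsigned int old)
{
	model_task_flags = (model_task_flags & ~pf) | old;
}

static unsigned int model_nofs_save(void)
{
	return model_scope_save(MODEL_PF_NOFS);
}

static void model_nofs_restore(unsigned int old)
{
	model_scope_restore(MODEL_PF_NOFS, old);
}

static unsigned int model_noio_save(void)
{
	return model_scope_save(MODEL_PF_NOIO);
}

static void model_noio_restore(unsigned int old)
{
	model_scope_restore(MODEL_PF_NOIO, old);
}

/* The mask the model allocator actually uses for a requested mask. */
static unsigned int model_effective_gfp(unsigned int gfp)
{
	if (model_task_flags & MODEL_PF_NOIO)
		gfp &= ~(MODEL_GFP_IO | MODEL_GFP_FS);	/* NOIO implies NOFS */
	else if (model_task_flags & MODEL_PF_NOFS)
		gfp &= ~MODEL_GFP_FS;
	return gfp;
}

/* A vmalloc-like helper whose inner allocation hardcodes GFP_KERNEL. */
static unsigned int model_vmalloc_like_gfp(void)
{
	return model_effective_gfp(MODEL_GFP_KERNEL);
}

/* Walk through nested scopes and check the effective mask at each step. */
static void model_scope_demo(void)
{
	unsigned int outer, inner;

	assert(model_vmalloc_like_gfp() == MODEL_GFP_KERNEL);

	outer = model_nofs_save();		/* NOFS scope begins */
	assert(model_vmalloc_like_gfp() == MODEL_GFP_IO);

	inner = model_noio_save();		/* nested NOIO scope */
	assert(model_vmalloc_like_gfp() == 0);
	model_noio_restore(inner);

	assert(model_vmalloc_like_gfp() == MODEL_GFP_IO);
	model_nofs_restore(outer);		/* NOFS scope ends */

	assert(model_vmalloc_like_gfp() == MODEL_GFP_KERNEL);
}
```

In kernel code the corresponding pattern would be to store the return value
of ``memalloc_nofs_save()`` in an ``unsigned int``, call ``vmalloc`` inside
the scope, and pass the stored value to ``memalloc_nofs_restore()``
afterwards, with a comment naming the reclaim context being protected.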