Lines Matching full:that
21 exploration is needed to discover, is that it is complex. There are
22 many rules, special cases, and implementation alternatives that all
25 tool that we will make extensive use of is "divide and conquer". For
40 of elements: "slashes" that are sequences of one or more "``/``"
41 characters, and "components" that are sequences of one or more
42 non-"``/``" characters. These form two kinds of paths. Those that
51 component, but that isn't always accurate: a pathname can lack both
61 it must identify a directory that already exists, otherwise an error
67 pathname that is just slashes have a final component. If it does
74 tempting to consider that to have an empty final component. In many
75 ways that would lead to correct results, but not always. In
80 A pathname that contains at least one non- <slash> character and
81 that ends with one or more trailing <slash> characters shall not
84 directory entry that is to be created for a directory immediately
90 checking that the trailing slash is not used where it isn't
95 changes that affect that lookup. One fairly extreme case is that if
97 "a/b/..", that process might successfully resolve on "a/c".
101 "dcache" and an understanding of that is central to understanding
111 contains further information about the object in that parent with
112 the given name. The inode pointer can be ``NULL`` indicating that the
114 dentry of a directory to the dentries of the children, that linkage is
118 that will be particularly relevant is that it is closely integrated
119 with the mount table that records which filesystem is mounted where.
126 Some filesystems ensure that the information in the dcache is always
129 without checking with the filesystem, and means that the VFS can
133 Other filesystems don't provide that guarantee because they cannot.
134 These are typically filesystems that are shared across a network,
149 you ignore all the places that only run when "``LOOKUP_RCU``"
167 reference count. The special-sauce of this primitive is that the
171 Holding a reference on a dentry ensures that the dentry won't suddenly
198 ``d_lock`` is a synonym for the spinlock that is part of ``d_lockref`` above.
205 each candidate dentry that it finds in the hash table and then checks
206 that the parent and name are correct. So it doesn't lock the parent
221 accessing that slot in a hash table, and searching the linked list
222 that is found there.
227 happened to be looking at a dentry that was moved in this way,
233 ``rename_lock`` is a seqlock that is updated whenever any dentry is
234 renamed. If ``d_lookup`` finds that a rename happened while it
241 ``i_rwsem`` is a read/write semaphore that serializes all changes to a particular
242 directory. This ensures that, for example, an ``unlink()`` and a ``rename()``
244 stable while the filesystem is asked to look up a name that is not
248 This has a complementary role to that of ``d_lock``: ``i_rwsem`` on a
249 directory protects all of the names in that directory, while ``d_lock``
260 falls back to ``lookup_slow()`` which takes a shared lock on ``i_rwsem``, checks again that
267 that the required exclusion can be achieved. How path lookup chooses
272 name that is not yet in the dcache - the shared lock on ``i_rwsem`` will
285 If a matching dentry was found in the primary hash table then that is
286 returned and the caller can know that it lost a race with some other
290 knows that it has won any race and now is responsible for asking the
295 added to the primary hash table already. Note that a ``struct
302 ``DCACHE_PAR_LOOKUP`` to be cleared, using a wait_queue that was passed
303 to the instance of ``d_alloc_parallel()`` that won the race and that
306 has, the dentry is returned and the caller just sees that it lost any
308 likely explanation is that some other dentry was added instead using
317 Per-CPU here means that incrementing the count is cheap as it only
322 ``mnt_count`` doesn't ensure that the mount remains in the namespace and,
324 does, however, ensure that the ``mount`` data structure remains coherent,
336 crossing a mount point to check that the crossing was safe. That is,
337 the value in the seqlock is read, then the code finds the mount that
368 all the way back to `First Edition Unix`_ - of the function that
387 that is the "next" component in the pathname.
402 filesystem. Often that reference won't be needed, so this field is
404 is requested. Keeping a reference in the ``nameidata`` ensures that
416 escape that subtree. It works a bit like a local ``chroot()``.
422 Given a path (``name``) and a nameidata structure (``nd``), check that the
424 over one component while updating ``last_type`` and ``last``. If that
432 filesystem to revalidate the result if it is that sort of filesystem.
433 If that doesn't get a good result, it calls "``lookup_slow()``" which
447 seem obvious, but is worth pointing out so that we will recognize its
455 not call ``walk_component()`` that last time. Handling that final
477 implementation of ``lookup_slow()`` which skips that step. This is
478 important when unmounting a filesystem that is inaccessible, such as
490 the possibility that the final component is not ``LAST_NORM``. If the
494 won't try to create that name. They also check for trailing slashes
505 On filesystems that require it, the lookup routines will call the
506 ``->d_revalidate()`` dentry method to ensure that the cached information
508 from a server. In some cases it may find that there has been change
509 further up the path and that something that was thought to be valid
516 look up a name can trigger changes to how that lookup should be
525 to three different flags that might be set in ``dentry->d_flags``:
530 If this flag has been set, then the filesystem has requested that the
535 unmounted, the ``d_manage()`` function will usually wait for that
542 processing. That server process can identify itself to the ``autofs``
549 This flag is set on every dentry that is mounted on. As Linux
550 supports multiple filesystem namespaces, it is possible that the
568 report that there was an error, that there was nothing to mount, or
574 There is no new locking of import here and it is important that no
588 We noted that REF-walk is complex because there are numerous details
600 thread from changing the data structures that a given thread is
603 same time, this can be very costly. Even when using locks that permit
606 goal when reading a shared data structure that no other process is
616 other parts it is important that RCU-walk can quickly fall back to
623 notices that something has changed or is changing, or if something
628 ``vfsmount`` and ``dentry``, and ensuring that these are still valid -
629 that a path walk with REF-walk would have found the same entries.
630 This is an invariant that RCU-walk must guarantee. It can only make
631 decisions, such as selecting the next step, that are decisions which
638 This pattern of "try RCU-walk, if that fails try REF-walk" can be
646 that fails with the error ``ECHILD`` they are called again with no
649 ``LOOKUP_RCU``) to ensure that entries found in the cache are forcibly
651 determines that they are too old to trust.
653 The ``LOOKUP_RCU`` attempt may drop that flag internally and switch to
655 that trip up RCU-walk are much more likely to be near the leaves and
656 so it is very unlikely that there will be much, if any, benefit from
663 ``rcu_read_lock()`` is held for the entire time that RCU-walk is walking
664 down a path. The particular guarantee it provides is that the key
669 is the only guarantee that RCU provides; everything else is done using
681 To preserve the invariant mentioned above (that RCU-walk may only make
682 decisions that REF-walk could have made), it must make the checks at
683 or near the same places that REF-walk holds the references. So, when
690 However, there is a little bit more to seqlocks than that. If
695 use ``read_seqcount_retry()`` to validate that copy.
698 imposes a memory barrier so that no memory-read instruction from
710 sufficient to catch any problem that could occur at this point.
712 With that little refresher on seqlocks out of the way we can look at
719 ensure that crossing a mount point is performed safely. RCU-walk uses
720 it for that too, but for quite a bit more.
729 that any "mount" or "unmount" happens.
739 If RCU-walk finds that ``mount_lock`` hasn't changed then it can be sure
740 that, had REF-walk taken counted references on each vfsmount, the
762 check if we have landed on a mount point and, if so, must find that
765 starting point of the path lookup was in part of the filesystem that
776 ``lookup_fast()`` is the only lookup routine that is used in RCU-mode,
778 ``lookup_fast()`` that we find the important "hand over hand" tracking
788 getting a counted reference to the new dentry before dropping that for
794 A semaphore is a fairly heavyweight lock that can only be taken when it is
797 take ``i_rwsem`` and modifies the directory in a way that RCU-walk needs
798 to notice, the result will be either that RCU-walk fails to find the
799 dentry that it is looking for, or it will find a dentry which
807 something that actually is there. When RCU-walk fails to find
816 That "dropping down to REF-walk" typically involves a call to
829 Other reasons for dropping out of RCU-walk that do not trigger a call
830 to ``unlazy_walk()`` are when some inconsistency is found that cannot be
837 takes a reference on each of the pointers that it holds (vfsmount,
838 dentry, and possibly some symbolic links) and then verifies that the
844 incrementing a counter. That works to take a second reference if you
853 ``mount_lock`` is then used to validate the reference. If that
854 validation fails, it may *not* be safe to just drop that reference in
857 finds that the reference it got might not be safe, checks the
874 In this case an extra "``MAY_NOT_BLOCK``" flag is passed so that it
898 the big picture, there are a couple of related patterns that are worth
901 The first is "try quickly and check, if that fails try slowly". We
902 can see that in the high-level approach of first trying RCU-walk and
908 The second pattern is "try quickly and check, if that fails try
915 "try quickly _and carefully,_ then check". The fact that checking is
916 needed is a reminder that the system is dynamic and only a limited
925 There are several basic issues that we will examine to understand the
935 There are only two sorts of filesystem objects that can usefully
943 a component name refers to a symbolic link, then that component is
944 replaced by the body of the link and, if that body starts with a '/',
981 further limit of eight on the maximum depth of recursion, but that was
985 The ``nameidata`` structure that we met in an earlier article contains a
986 small stack that can be used to store the remaining part of up to two
989 lookup will never exceed that stack as, once the 40th symlink is
992 It might seem that the name remnants are all that needs to be stored on
993 this stack, but we need a bit more. To see that, we need to move on to
1002 able to find and temporarily hold onto these cached entries, so that
1014 pathname in a symlink can be seen as the content of that symlink and
1018 that the filesystem will allocate some temporary memory and copy or
1019 construct the symlink content into that memory whenever it is needed.
1023 on the dentry. This means that the mechanisms that pathname lookup
1031 on an inode does not imply any reference on cached pages of that
1032 inode, and even an ``rcu_read_lock()`` is not sufficient to ensure that
1035 significantly, needs to release that reference when it is finished
1040 but that isn't necessarily a big cost and it is better than dropping
1041 out of RCU-walk mode completely. Even filesystems that allocate
1050 RCU-walk mode as the rewrite is not quite complete. It is likely that
1054 looked at previously, ``->follow_link()`` would need to be careful that
1058 code is ready to release the reference when that does happen.
1061 complexity. It requires a reference to the inode so that the
1062 ``i_op->put_link()`` inode operation can be called. In REF-walk, that
1067 we also need the seq number for the dentry so we can confirm that
1071 provides an opaque "cookie" that must be passed to ``->put_link()`` so that it
1083 - the ``cookie`` that tells ``->put_link()`` what to put.
1085 This means that each entry in the symlink stack needs to hold five
1092 Note that, in a given stack frame, the path remnant (``name``) is not
1093 part of the symlink that the other fields refer to. It is the remnant
1094 to be followed once that symlink has been fully parsed.
1102 symlink, or is restored from the stack, so that much of the loop
1110 called; it then gets the link from the filesystem. Providing that
1120 the symlink-just-found to avoid leaving empty path remnants that would
1125 ``walk_component()`` is also the last piece of code that needs to look at the
1126 old symlink as it walks that last component. So it is quite
1148 so ``NULL`` is returned to indicate that the symlink can be released and
1151 The other case involves things in ``/proc`` that look like symlinks but
1158 something that looks like a symlink. It is really a reference to the
1160 objects you get a name that might refer to the same file - unless it
1164 ``nameidata`` in place to point to that target. ``->follow_link()`` then
1175 For some callers, this is all they need; they want to create that
1178 apply special handling to the last component of that symlink, rather
1181 successive symlinks until one is found that doesn't point to another
1185 ``path_lookupat()`` using a loop that calls ``link_path_walk()``, and then
1187 that needs to be followed, then ``trailing_symlink()`` is called to set
1192 The various functions that examine the final component and possibly
1193 report that it is a symlink are ``lookup_last()``, ``mountpoint_last()``
1195 ``walk_component()`` of returning ``1`` if a symlink was found that needs
1219 If that doesn't work, only then is the lookup restarted from the top.
1230 so does ``do_last()`` so that ``trailing_symlink()`` gets called and the
1231 open process continues on the symlink that was found.
1236 We previously said of RCU-walk that it would "take no locks, increment
1237 no counts, leave no footprints." We have since seen that some
1243 footprints in a way that doesn't affect directories is in updating access times.
1250 update the atime on that symlink.
1255 subject. The `clearest statement`_ is that, if a particular implementation
1257 documented "except that any changes caused by pathname resolution need
1258 not be documented". This seems to imply that POSIX doesn't really
1263 An examination of history shows that prior to `Linux 1.3.87`_, the ext2
1265 Unfortunately we have no record of why that behavior was changed.
1267 In any case, access time must now be updated and that operation can be
1272 limits the updates of ``atime`` to once per day on files that aren't
1287 the various flags that can be stored in the ``nameidata`` to guide the
1302 ``LOOKUP_PARENT`` indicates that the final component hasn't been reached
1306 ``LOOKUP_ROOT`` indicates that the ``root`` field in the ``nameidata`` was
1310 ``LOOKUP_JUMPED`` means that the current dentry was chosen not because
1324 considered. Others are only checked for when considering that final
1327 ``LOOKUP_AUTOMOUNT`` ensures that, if the final component is an automount
1338 ``WALK_GET`` that we already met, but it is used in a different way.
1340 ``LOOKUP_DIRECTORY`` insists that the final component is a directory.
1348 if it knows that it will be asked to open or create the file soon.
1357 than even a couple of releases ago. But that doesn't mean it is
1359 symlinks that are stored in the inode so, while it handles many ext4
1360 symlinks, it doesn't help with NFS, XFS, or Btrfs. That support