Lines Matching full:load
59 - Read memory barriers vs load speculation.
159 STORE A=3, STORE B=4, y=LOAD A->3, x=LOAD B->4
160 STORE A=3, STORE B=4, x=LOAD B->4, y=LOAD A->3
161 STORE A=3, y=LOAD A->3, STORE B=4, x=LOAD B->4
162 STORE A=3, y=LOAD A->3, x=LOAD B->2, STORE B=4
163 STORE A=3, x=LOAD B->2, STORE B=4, y=LOAD A->3
164 STORE A=3, x=LOAD B->2, y=LOAD A->3, STORE B=4
165 STORE B=4, STORE A=3, y=LOAD A->3, x=LOAD B->4
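The combinations listed above (source lines 159-165) come from the document's opening example of what may happen when no barriers are used at all. A minimal sketch of the kind of two-CPU program that produces them; the initial value of A is an assumption, the initial value of B is implied by the "x=LOAD B->2" outcomes:

	CPU 1			CPU 2
	===============		===============
	{ A == 1, B == 2 }	/* assumed initial values */
	A = 3;			x = B;
	B = 4;			y = A;

With no ordering constraints, the two stores and the two loads may be committed and perceived in any interleaving, which is what the enumerated STORE/LOAD sequences illustrate (x=LOAD B->2 reads the old value of B, x=LOAD B->4 the new one).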
198 Note that CPU 2 will never try to load C into D because the CPU will load P
199 into Q before issuing the load of *Q.
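Source lines 198-199 state the dependent-load guarantee. A sketch of the pointer-publication pattern they refer to; the initial state shown here is an assumption reconstructed from the surrounding text:

	CPU 1			CPU 2
	===============		===============
	{ A == 1, B == 2, C == 3, P == &A, Q == &C }	/* assumed initial state */
	B = 4;
	P = &B;
				Q = P;
				D = *Q;

Because the address used by the second load is produced by the first, CPU 2 must load P into Q before it can even issue the load of *Q, so D can only come from A or B, never from C, the location Q happened to point at initially.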
217 STORE *A = 5, x = LOAD *D
218 x = LOAD *D, STORE *A = 5
236 Q = LOAD P, D = LOAD *Q
242 Q = LOAD P, MEMORY_BARRIER, D = LOAD *Q, MEMORY_BARRIER
254 a = LOAD *X, STORE *X = b
262 STORE *X = c, d = LOAD *X
282 X = LOAD *A, Y = LOAD *B, STORE *D = Z
283 X = LOAD *A, STORE *D = Z, Y = LOAD *B
284 Y = LOAD *B, X = LOAD *A, STORE *D = Z
285 Y = LOAD *B, STORE *D = Z, X = LOAD *A
286 STORE *D = Z, X = LOAD *A, Y = LOAD *B
287 STORE *D = Z, Y = LOAD *B, X = LOAD *A
296 X = LOAD *A; Y = LOAD *(A + 4);
297 Y = LOAD *(A + 4); X = LOAD *A;
298 {X, Y} = LOAD {*A, *(A + 4) };
402 result of the first (eg: the first load retrieves the address to which
403 the second load will be directed), an address-dependency barrier would
404 be required to make sure that the target of the second load is updated
405 before the address obtained by the first load is accessed.
414 the CPU under consideration guarantees that for any load preceding it,
415 if that load touches one of a sequence of stores from another CPU, then
417 that touched by the load will be perceptible to any loads issued after
423 [!] Note that the first load really has to have an _address_ dependency and
424 not a control dependency. If the address for the second load is dependent
425 on the first load, but the dependency is through a conditional rather than
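Source lines 402-425 describe when an address-dependency barrier is needed and warn that the dependency must be carried through an address, not through a conditional. A hedged sketch of the two cases; the names are illustrative, and on current kernels the address-dependency barrier is folded into READ_ONCE() and rcu_dereference():

	/* Address dependency: the value loaded from P supplies the address of
	 * the second load, so the two loads are ordered. */
	Q = READ_ONCE(P);
	D = *Q;

	/* Control dependency: the first load only decides whether the second
	 * load executes.  This does NOT order the two loads; a read barrier
	 * (or an acquire load) is required instead. */
	q = READ_ONCE(flag);
	if (q) {
		smp_rmb();
		D = READ_ONCE(A);
	}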
438 (3) Read (or load) memory barriers.
441 the LOAD operations specified before the barrier will appear to happen
442 before all the LOAD operations specified after the barrier with respect to
457 A general memory barrier gives a guarantee that all the LOAD and STORE
459 the LOAD and STORE operations specified after the barrier with respect to
510 semantics) definitions. For compound atomics performing both a load and a
511 store, ACQUIRE semantics apply only to the load and RELEASE semantics apply
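Source lines 510-511 note that for a compound atomic, one that both loads and stores its target, ACQUIRE attaches to the load half and RELEASE to the store half. A hedged sketch using value-returning RMW helpers (the variable is an assumption, not from the document):

	atomic_t v;
	int old;

	/* ACQUIRE applies only to the load portion: later accesses are ordered
	 * after the load of v, but not necessarily after the store back to v. */
	old = atomic_fetch_add_acquire(1, &v);

	/* RELEASE applies only to the store portion: earlier accesses are ordered
	 * before the store to v, but not necessarily before the load of v. */
	old = atomic_fetch_add_release(1, &v);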
567 [!] While address dependencies are observed in both load-to-load and
568 load-to-store relations, address-dependency barriers are not necessary
569 for load-to-store situations.
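Source lines 567-569 make the point that address dependencies are honoured for both load-to-load and load-to-store relations, and that the load-to-store case needs no address-dependency barrier. A small sketch of the two shapes (names illustrative):

	/* Load-to-load: the loaded pointer feeds the address of a later load. */
	Q = READ_ONCE(P);
	D = *Q;

	/* Load-to-store: the loaded pointer feeds the address of a later store;
	 * the store cannot be issued until Q is known, so no barrier is needed. */
	Q = READ_ONCE(P);
	WRITE_ONCE(*Q, 5);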
680 A load-load control dependency requires a full read memory barrier, not
694 the load from b as having happened before the load from a. In such a case
704 for load-store control dependencies, as in the following example:
714 load from 'a' with other loads from 'a'. Without the WRITE_ONCE(),
747 WRITE_ONCE(b, 1); /* BUG: No ordering vs. load from a!!! */
756 Now there is no conditional between the load from 'a' and the store to
809 between the load from variable 'a' and the store to variable 'b'. It is
816 BUILD_BUG_ON(MAX <= 1); /* Order load from a with store to b. */
845 the compiler to actually emit code for a given load, it does not force
874 A weakly ordered CPU would have no dependency of any sort between the load
906 between the prior load and the subsequent store, and this
907 conditional must involve the prior load. If the compiler is able
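The fragments above (source lines 680-907) are from the CONTROL DEPENDENCIES discussion: a conditional orders a prior load against a later store, but only for load-to-store relations and only if the compiler cannot optimise the conditional away. A short sketch of the pattern and of the hazard the "BUG" comment on source line 747 refers to:

	/* Load-to-store control dependency: the store is ordered after the load
	 * because the CPU must resolve the branch first. */
	q = READ_ONCE(a);
	if (q)
		WRITE_ONCE(b, 1);

	/* But if both branches store the same value, the compiler may hoist the
	 * store above the conditional, destroying the ordering. */
	q = READ_ONCE(a);
	if (q)
		WRITE_ONCE(b, 1);
	else
		WRITE_ONCE(b, 1);	/* BUG: No ordering vs. load from a!!! */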
1045 STORE C = &B LOAD X
1046 STORE D = 4 LOAD C (gets &B)
1047 LOAD *C (reads B)
1072 The load of X holds ---> \ | X->9 |------>| |
1079 In the above example, CPU 2 perceives that B is 7, despite the load of *C
1080 (which would be B) coming after the LOAD of C.
1082 If, however, an address-dependency barrier were to be placed between the load
1083 of C and the load of *C (ie: B) on CPU 2:
1091 STORE C = &B LOAD X
1092 STORE D = 4 LOAD C (gets &B)
1094 LOAD *C (reads B)
1132 LOAD B
1133 LOAD A
1159 If, however, a read barrier were to be placed between the load of B and the
1160 load of A on CPU 2:
1168 LOAD B
1170 LOAD A
1196 contained a load of A either side of the read barrier:
1204 LOAD B
1205 LOAD A [first load of A]
1207 LOAD A [second load of A]
1209 Even though the two loads of A both occur after the load of B, they may both
1261 The guarantee is that the second load will always come up with A == 1 if the
1262 load of B came up with B == 2. No such guarantee exists for the first load of
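Source lines 1132-1262 walk through the write-barrier/read-barrier pairing; lines 1261-1262 state the resulting guarantee. A sketch of that pairing with the kernel primitives; the initial values are an assumption, chosen so that old and new values are distinguishable:

	CPU 1			CPU 2
	===============		===============
	{ A == 0, B == 9 }	/* assumed initial values */
	WRITE_ONCE(A, 1);
	smp_wmb();
	WRITE_ONCE(B, 2);
				x = READ_ONCE(B);
				smp_rmb();
				y = READ_ONCE(A);

If x turns out to be 2, the read barrier guarantees that y is 1; a load of A performed before the smp_rmb() enjoys no such guarantee, which is what the "first load of" A in source line 1262 refers to.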
1266 READ MEMORY BARRIERS VS LOAD SPECULATION
1269 Many CPUs speculate with loads: that is, they see that they will need to load an
1271 other loads, and so do the load in advance - even though they haven't actually
1273 actual load instruction to potentially complete immediately because the CPU
1277 branch circumvented the load - in which case it can discard the value or just
1284 LOAD B
1287 LOAD A
1299 LOAD of A : : ~ | |
1304 LOAD with immediate effect : : +-------+
1308 load:
1312 LOAD B
1316 LOAD A
1330 LOAD of A : : ~ | |
1352 LOAD of A : : ~ | |
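Source lines 1269-1352 are from the "READ MEMORY BARRIERS VS LOAD SPECULATION" section and its timing diagrams: a CPU may perform a load early, while the bus is otherwise idle, and later either use the speculated value or discard it. A minimal sketch of the code shape involved; the intervening work stands in for the divisions used in the document's diagrams:

	x = READ_ONCE(B);
	/* ... long-latency work here; the CPU may speculatively load A early ... */
	smp_rmb();
	y = READ_ONCE(A);

With the smp_rmb() in place, a speculated value of A that has since been invalidated by another CPU's store must be discarded and the load redone; without the barrier, the stale speculated value may be used.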
1381 STORE X=1 r1=LOAD X (reads 1) LOAD Y (reads 1)
1383 STORE Y=r1 LOAD X
1385 Suppose that CPU 2's load from X returns 1, which it then stores to Y,
1386 and CPU 3's load from Y returns 1. This indicates that CPU 1's store
1387 to X precedes CPU 2's load from X and that CPU 2's store to Y precedes
1388 CPU 3's load from Y. In addition, the memory barriers guarantee that
1389 CPU 2 executes its load before its store, and CPU 3 loads from Y before
1390 it loads from X. The question is then "Can CPU 3's load from X return 0?"
1392 Because CPU 3's load from X in some sense comes after CPU 2's load, it
1393 is natural to expect that CPU 3's load from X must therefore return 1.
1394 This expectation follows from multicopy atomicity: if a load executing
1395 on CPU B follows a load from the same variable executing on CPU A (and
1397 multicopy-atomic systems, CPU B's load must return either the same value
1398 that CPU A's load did or some later value. However, the Linux kernel
1402 for any lack of multicopy atomicity. In the example, if CPU 2's load
1403 from X returns 1 and CPU 3's load from Y returns 1, then CPU 3's load
1414 STORE X=1 r1=LOAD X (reads 1) LOAD Y (reads 1)
1416 STORE Y=r1 LOAD X (reads 0)
1419 this example, it is perfectly legal for CPU 2's load from X to return 1,
1420 CPU 3's load from Y to return 1, and its load from X to return 0.
1422 The key point is that although CPU 2's data dependency orders its load
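Source lines 1381-1422 develop the three-CPU example used to discuss multicopy atomicity. A hedged reconstruction with kernel primitives; initial values of zero are assumed, consistent with the "reads 0" annotation on source line 1416:

	CPU 1			CPU 2			CPU 3
	===============		===============		===============
	{ X == 0, Y == 0 }	/* assumed initial values */
	WRITE_ONCE(X, 1);	r1 = READ_ONCE(X);	r2 = READ_ONCE(Y);
				smp_mb();		smp_rmb();
				WRITE_ONCE(Y, r1);	r3 = READ_ONCE(X);

With the general barrier (smp_mb()) on CPU 2, the outcome r1 == 1 && r2 == 1 && r3 == 0 is forbidden. If CPU 2 instead relies only on the data dependency from its load of X to its store to Y, that outcome becomes possible, which is the point source lines 1419-1422 are making: the dependency orders CPU 2's own load and store but says nothing about when CPU 1's store becomes visible to CPU 3.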
1495 store to u as happening -after- cpu1()'s load from v, even though
1545 (*) Within a loop, forces the compiler to load the variables used
1620 (*) The compiler is within its rights to omit a load entirely if it knows
1632 gets rid of a load and a branch. The problem is that the compiler
1650 the code into near-nonexistence. (It will still load from the
1772 with a single memory-reference instruction, prevents "load tearing"
1791 Use of packed structures can also result in load and store tearing,
1810 load tearing on 'foo1.b' and store tearing on 'foo2.b'. READ_ONCE()
1845 to issue the loads in the correct order (eg. `a[b]` would have to load
1848 (eg. is equal to 1) and load a[b] before b (eg. tmp = a[1]; if (b != 1)
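Source lines 1545-1848 come from the discussion of what READ_ONCE() and WRITE_ONCE() protect against: omitted or refetched loads, load and store tearing (especially with packed structures, per the foo1.b/foo2.b fragments above), and reordering of a[b] against b. A sketch of the tearing case; the structure layout follows those fragments, but the exact field types are assumptions:

	struct __attribute__((__packed__)) foo {
		short a;
		int b;
		short c;
	};
	struct foo foo1, foo2;

	/* A plain copy may be split by the compiler into several narrower
	 * accesses: load tearing on foo1.b and store tearing on foo2.b. */
	foo2.b = foo1.b;

	/* Forcing single full-width accesses where the architecture allows: */
	WRITE_ONCE(foo2.b, READ_ONCE(foo1.b));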
1963 For loads from persistent memory, existing read memory barriers are sufficient
2169 LOAD event_indicated
2212 LOAD event_indicated if ((LOAD task->state) & TASK_NORMAL)
2226 LOAD Y LOAD X
2398 LOAD waiter->list.next;
2399 LOAD waiter->task;
2422 LOAD waiter->task;
2431 LOAD waiter->list.next;
2439 LOAD waiter->list.next;
2440 LOAD waiter->task;
2522 STORE *ADDR = 3, STORE *ADDR = 4, STORE *DATA = y, q = LOAD *DATA
2531 sections will include synchronous load operations on strictly ordered I/O
2676 ultimate effect. For example, if two adjacent instructions both load an
2720 Although any particular load or store may not actually appear outside of the
2728 generate load and store operations which then go into the queue of memory
2801 LOAD *A, STORE *B, LOAD *C, LOAD *D, STORE *E.
2833 LOAD *A, ..., LOAD {*C,*D}, STORE *E, STORE *B
2835 (Where "LOAD {*C,*D}" is a combined load)
2860 U=LOAD *A, STORE *A=V, STORE *A=W, X=LOAD *A, STORE *A=Y, Z=LOAD *A
2896 and the LOAD operation never appear outside of the CPU.
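The closing fragments (source lines 2676-2896) describe how the CPU's memory-access queue may combine, reorder or discard operations, while accesses to a single location from a single CPU are still perceived by that CPU in program order. The sequence on source line 2860, laid out as code within one CPU, with the outcomes that single-location program order guarantees noted in comments:

	U = *A;		/* U == whatever *A held initially */
	*A = V;
	*A = W;
	X = *A;		/* X == W */
	*A = Y;
	Z = *A;		/* Z == Y, and *A ends up == Y */

Some of the intermediate stores and loads may nevertheless be merged or elided by the memory system and never appear outside the CPU at all, which is what source line 2896 is pointing at.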