memory-barriers.txt - OpenGrok cross reference for /Linux-v5.10/Documentation/memory-barriers.txt

Lines Matching full:load
59      - Read memory barriers vs load speculation.
159 	STORE A=3,	STORE B=4,	y=LOAD A->3,	x=LOAD B->4
160 	STORE A=3,	STORE B=4,	x=LOAD B->4,	y=LOAD A->3
161 	STORE A=3,	y=LOAD A->3,	STORE B=4,	x=LOAD B->4
162 	STORE A=3,	y=LOAD A->3,	x=LOAD B->2,	STORE B=4
163 	STORE A=3,	x=LOAD B->2,	STORE B=4,	y=LOAD A->3
164 	STORE A=3,	x=LOAD B->2,	y=LOAD A->3,	STORE B=4
165 	STORE B=4,	STORE A=3,	y=LOAD A->3,	x=LOAD B->4
198 Note that CPU 2 will never try and load C into D because the CPU will load P
199 into Q before issuing the load of *Q.
217 	STORE *A = 5, x = LOAD *D
218 	x = LOAD *D, STORE *A = 5
236 	Q = LOAD P, D = LOAD *Q
242 	Q = LOAD P, MEMORY_BARRIER, D = LOAD *Q, MEMORY_BARRIER
254 	a = LOAD *X, STORE *X = b
262 	STORE *X = c, d = LOAD *X
282 	X = LOAD *A,  Y = LOAD *B,  STORE *D = Z
283 	X = LOAD *A,  STORE *D = Z, Y = LOAD *B
284 	Y = LOAD *B,  X = LOAD *A,  STORE *D = Z
285 	Y = LOAD *B,  STORE *D = Z, X = LOAD *A
286 	STORE *D = Z, X = LOAD *A,  Y = LOAD *B
287 	STORE *D = Z, Y = LOAD *B,  X = LOAD *A
296 	X = LOAD *A; Y = LOAD *(A + 4);
297 	Y = LOAD *(A + 4); X = LOAD *A;
298 	{X, Y} = LOAD {*A, *(A + 4) };
402      of the first (eg: the first load retrieves the address to which the second
403      load will be directed), a data dependency barrier would be required to
404      make sure that the target of the second load is updated after the address
405      obtained by the first load is accessed.
414      under consideration guarantees that for any load preceding it, if that
415      load touches one of a sequence of stores from another CPU, then by the
417      touched by the load will be perceptible to any loads issued after the data
423      [!] Note that the first load really has to have a _data_ dependency and
424      not a control dependency.  If the address for the second load is dependent
425      on the first load, but the dependency is through a conditional rather than
434  (3) Read (or load) memory barriers.
437      LOAD operations specified before the barrier will appear to happen before
438      all the LOAD operations specified after the barrier with respect to the
453      A general memory barrier gives a guarantee that all the LOAD and STORE
455      the LOAD and STORE operations specified after the barrier with respect to
506 semantics) definitions.  For compound atomics performing both a load and a
507 store, ACQUIRE semantics apply only to the load and RELEASE semantics apply
592 between the address load and the data load:
669 A load-load control dependency requires a full read memory barrier, not
682 the load from b as having happened before the load from a.  In such a
692 for load-store control dependencies, as in the following example:
702 load from 'a' with other loads from 'a'.  Without the WRITE_ONCE(),
735 	WRITE_ONCE(b, 1);  /* BUG: No ordering vs. load from a!!! */
744 Now there is no conditional between the load from 'a' and the store to
797 between the load from variable 'a' and the store to variable 'b'.  It is
804 	BUILD_BUG_ON(MAX <= 1); /* Order load from a with store to b. */
833 the compiler to actually emit code for a given load, it does not force
862 A weakly ordered CPU would have no dependency of any sort between the load
894       between the prior load and the subsequent store, and this
895       conditional must involve the prior load.  If the compiler is able
1033 	STORE C = &B		LOAD X
1034 	STORE D = 4		LOAD C (gets &B)
1035 				LOAD *C (reads B)
1060 	    The load of X holds --->    \       | X->9  |------>|       |
1067 In the above example, CPU 2 perceives that B is 7, despite the load of *C
1068 (which would be B) coming after the LOAD of C.
1070 If, however, a data dependency barrier were to be placed between the load of C
1071 and the load of *C (ie: B) on CPU 2:
1079 	STORE C = &B		LOAD X
1080 	STORE D = 4		LOAD C (gets &B)
1082 				LOAD *C (reads B)
1120 				LOAD B
1121 				LOAD A
1147 If, however, a read barrier were to be placed between the load of B and the
1148 load of A on CPU 2:
1156 				LOAD B
1158 				LOAD A
1184 contained a load of A either side of the read barrier:
1192 				LOAD B
1193 				LOAD A [first load of A]
1195 				LOAD A [second load of A]
1197 Even though the two loads of A both occur after the load of B, they may both
1249 The guarantee is that the second load will always come up with A == 1 if the
1250 load of B came up with B == 2.  No such guarantee exists for the first load of
1254 READ MEMORY BARRIERS VS LOAD SPECULATION
1257 Many CPUs speculate with loads: that is they see that they will need to load an
1259 other loads, and so do the load in advance - even though they haven't actually
1261 actual load instruction to potentially complete immediately because the CPU
1265 branch circumvented the load - in which case it can discard the value or just
1272 				LOAD B
1275 				LOAD A
1287 	LOAD of A                               :       :   ~   |       |
1292 	LOAD with immediate effect              :       :       +-------+
1296 load:
1300 				LOAD B
1304 				LOAD A
1318 	LOAD of A                               :       :   ~   |       |
1340 	LOAD of A                               :       :   ~   |       |
1369 	STORE X=1		r1=LOAD X (reads 1)	LOAD Y (reads 1)
1371 				STORE Y=r1		LOAD X
1373 Suppose that CPU 2's load from X returns 1, which it then stores to Y,
1374 and CPU 3's load from Y returns 1.  This indicates that CPU 1's store
1375 to X precedes CPU 2's load from X and that CPU 2's store to Y precedes
1376 CPU 3's load from Y.  In addition, the memory barriers guarantee that
1377 CPU 2 executes its load before its store, and CPU 3 loads from Y before
1378 it loads from X.  The question is then "Can CPU 3's load from X return 0?"
1380 Because CPU 3's load from X in some sense comes after CPU 2's load, it
1381 is natural to expect that CPU 3's load from X must therefore return 1.
1382 This expectation follows from multicopy atomicity: if a load executing
1383 on CPU B follows a load from the same variable executing on CPU A (and
1385 multicopy-atomic systems, CPU B's load must return either the same value
1386 that CPU A's load did or some later value.  However, the Linux kernel
1390 for any lack of multicopy atomicity.  In the example, if CPU 2's load
1391 from X returns 1 and CPU 3's load from Y returns 1, then CPU 3's load
1402 	STORE X=1		r1=LOAD X (reads 1)	LOAD Y (reads 1)
1404 				STORE Y=r1		LOAD X (reads 0)
1407 this example, it is perfectly legal for CPU 2's load from X to return 1,
1408 CPU 3's load from Y to return 1, and its load from X to return 0.
1410 The key point is that although CPU 2's data dependency orders its load
1483 store to u as happening -after- cpu1()'s load from v, even though
1533  (*) Within a loop, forces the compiler to load the variables used
1608  (*) The compiler is within its rights to omit a load entirely if it knows
1620      gets rid of a load and a branch.  The problem is that the compiler
1638      the code into near-nonexistence.  (It will still load from the
1760      with a single memory-reference instruction, prevents "load tearing"
1779      Use of packed structures can also result in load and store tearing,
1798      load tearing on 'foo1.b' and store tearing on 'foo2.b'.  READ_ONCE()
1833 to issue the loads in the correct order (eg. `a[b]` would have to load
1836 (eg. is equal to 1) and load a[b] before b (eg. tmp = a[1]; if (b != 1)
1950      For load from persistent memory, existing read memory barriers are sufficient
2148 	LOAD event_indicated
2191 	LOAD event_indicated		  if ((LOAD task->state) & TASK_NORMAL)
2205 	LOAD Y				LOAD X
2377 	LOAD waiter->list.next;
2378 	LOAD waiter->task;
2401 	LOAD waiter->task;
2410 	LOAD waiter->list.next;
2418 	LOAD waiter->list.next;
2419 	LOAD waiter->task;
2501 	STORE *ADDR = 3, STORE *ADDR = 4, STORE *DATA = y, q = LOAD *DATA
2510 sections will include synchronous load operations on strictly ordered I/O
2655 ultimate effect.  For example, if two adjacent instructions both load an
2699 Although any particular load or store may not actually appear outside of the
2707 generate load and store operations which then go into the queue of memory
2779 	LOAD *A, STORE *B, LOAD *C, LOAD *D, STORE *E.
2811 	LOAD *A, ..., LOAD {*C,*D}, STORE *E, STORE *B
2813 	(Where "LOAD {*C,*D}" is a combined load)
2838 	U=LOAD *A, STORE *A=V, STORE *A=W, X=LOAD *A, STORE *A=Y, Z=LOAD *A
2874 and the LOAD operation never appear outside of the CPU.