utf8-norm.c - OpenGrok cross reference for /Linux-v5.10/fs/unicode/utf8-norm.c

Lines Matching +full:10 +full:a
41  * The UTF-8 encoding spreads the bits of a 32bit word over several
46  * 0x00000000 0x000007FF: 110xxxxx 10xxxxxx
47  * 0x00000000 0x0000FFFF: 1110xxxx 10xxxxxx 10xxxxxx
48  * 0x00000000 0x001FFFFF: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
49  * 0x00000000 0x03FFFFFF: 111110xx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx
50  * 0x00000000 0x7FFFFFFF: 1111110x 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx
53  * shortest representation of a 32bit value is to be used.  A decoder
55  * Thus the allowed ranges have a lower bound.
58  * 0x00000080 0x000007FF: 110xxxxx 10xxxxxx
59  * 0x00000800 0x0000FFFF: 1110xxxx 10xxxxxx 10xxxxxx
60  * 0x00010000 0x001FFFFF: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
61  * 0x00200000 0x03FFFFFF: 111110xx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx
62  * 0x04000000 0x7FFFFFFF: 1111110x 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx
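
The lower bounds in the table above can be sketched as a length lookup plus an overlong check. This is a minimal illustration, not code from utf8-norm.c: the names utf8_shortest_len() and utf8_is_overlong() are hypothetical, and the 1-byte row (0x00 to 0x7F) is added from the general UTF-8 definition because the search did not match it.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of utf8-norm.c): number of bytes in the
 * shortest encoding of cp under the original 31-bit UTF-8 scheme, or 0
 * if cp does not fit in 31 bits.
 */
static int utf8_shortest_len(uint32_t cp)
{
	if (cp <= 0x0000007F) return 1;
	if (cp <= 0x000007FF) return 2;
	if (cp <= 0x0000FFFF) return 3;
	if (cp <= 0x001FFFFF) return 4;
	if (cp <= 0x03FFFFFF) return 5;
	if (cp <= 0x7FFFFFFF) return 6;
	return 0;
}

/*
 * Hypothetical helper: a value decoded from nbytes bytes is overlong
 * when its shortest form needs fewer bytes, i.e. when it lies below the
 * lower bound of the row for nbytes.
 */
static int utf8_is_overlong(uint32_t cp, int nbytes)
{
	int shortest = utf8_shortest_len(cp);

	assert(nbytes >= 1 && nbytes <= 6);
	return shortest != 0 && nbytes > shortest;
}
```

For example, the value 0x2F carried in a 2-byte sequence is overlong, while 0x80 in a 2-byte sequence sits exactly on the lower bound and is accepted.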
76  * the same as a single UTF-32 character.  This makes the UTF-8
88  * Assumes the input points to the first byte of a valid UTF-8
99  * Decode a 3-byte UTF-8 sequence.
116  * Encode a 3-byte UTF-8 sequence.
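
The 3-byte layout 1110xxxx 10xxxxxx 10xxxxxx can be packed and unpacked with a few shifts and masks. The sketch below only illustrates that bit layout: decode3() and encode3() are hypothetical names, and the kernel's own helpers may differ in signature and types.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: decode a 3-byte sequence 1110xxxx 10xxxxxx 10xxxxxx.
 * Assumes s points at a well-formed 3-byte sequence; nothing is validated.
 */
static uint32_t decode3(const unsigned char *s)
{
	uint32_t uc;

	uc = s[0] & 0x0F;               /* 4 payload bits from the lead byte */
	uc = (uc << 6) | (s[1] & 0x3F); /* 6 bits from each continuation byte */
	uc = (uc << 6) | (s[2] & 0x3F);
	return uc;
}

/*
 * Hypothetical helper: encode val (0x0800..0xFFFF) as 3 bytes.
 * Returns the number of bytes written.
 */
static int encode3(unsigned char *s, uint32_t val)
{
	assert(val >= 0x800 && val <= 0xFFFF);
	s[2] = (unsigned char)((val & 0x3F) | 0x80);
	val >>= 6;
	s[1] = (unsigned char)((val & 0x3F) | 0x80);
	val >>= 6;
	s[0] = (unsigned char)(val | 0xE0);
	return 3;
}
```

Encoding U+AC00 this way yields the bytes EA B0 80, and decoding those bytes gives 0xAC00 back.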
133  * A compact binary tree, used to decode UTF-8 characters.
145  *                            node, otherwise it is a leaf node
151  * NEXTBYTE set, and moreover those nodes always have a righthand
170  * leaf[0]: The unicode version, stored as a generation number that is
173  *          defined.  The CCC of a non-defined code point is 0.
175  *          to do a stable sort into ascending order of all characters
176  *          with a non-zero CCC that occur between two characters with
177  *          a CCC of 0, or at the beginning or end of a string.
180  *          a special value.
183  *          start of a NUL-terminated string that is the decomposition
185  *          The CCC of a decomposable character is the same as the CCC
196  * The trie is constructed in such a way that leaves exist for all
198  * ranges" comment above, and only for those sequences.  Therefore a
295 	/* Add LPart, a 3-byte UTF-8 sequence. */  in utf8hangul()
298 	/* Add VPart, a 3-byte UTF-8 sequence. */  in utf8hangul()
301 	/* Add TPart if required, also a 3-byte UTF-8 sequence. */  in utf8hangul()
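
The LPart, VPart and TPart mentioned in these comments correspond to the arithmetic Hangul syllable decomposition defined by the Unicode Standard. The sketch below shows that arithmetic on code points only. It is an illustration under stated assumptions: hangul_decompose() is a hypothetical name, the constants are the standard Unicode ones, and the kernel function additionally writes each part out as a 3-byte UTF-8 sequence.

```c
#include <assert.h>
#include <stdint.h>

/* Constants from the Unicode Standard's Hangul syllable algorithm. */
#define SBASE  0xAC00u
#define LBASE  0x1100u
#define VBASE  0x1161u
#define TBASE  0x11A7u
#define VCOUNT 21u
#define TCOUNT 28u
#define NCOUNT (VCOUNT * TCOUNT)  /* 588 */
#define SCOUNT (19u * NCOUNT)     /* 11172 */

/*
 * Hypothetical helper: split a precomposed Hangul syllable into its
 * jamo code points. Returns the number of parts written to out
 * (2 or 3), or 0 if s is not a precomposed Hangul syllable.
 */
static int hangul_decompose(uint32_t s, uint32_t out[3])
{
	uint32_t si, ti;

	assert(out);
	if (s < SBASE || s >= SBASE + SCOUNT)
		return 0;

	si = s - SBASE;
	out[0] = LBASE + si / NCOUNT;            /* LPart: leading consonant */
	out[1] = VBASE + (si % NCOUNT) / TCOUNT; /* VPart: vowel */
	ti = si % TCOUNT;
	if (ti == 0)
		return 2;                        /* no trailing consonant */
	out[2] = TBASE + ti;                     /* TPart: trailing consonant */
	return 3;
}
```

For example, U+AC00 decomposes into U+1100 U+1161 with no TPart, while U+AC01 decomposes into U+1100 U+1161 U+11A8. All three jamo ranges lie between 0x0800 and 0xFFFF, which is why each part is a 3-byte UTF-8 sequence.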
315  * A non-NULL return guarantees that the UTF-8 sequence starting at s
316  * is well-formed and corresponds to a known unicode code point.  The
518  * A string of Default_Ignorable_Code_Point has length 0.
631  * When a character is decomposed, the current location is stored in
633  * that bytes from a decomposition do not count against u8c->len.
648  *  u8c->p  != NULL -> a decomposition is being scanned.
649  *  u8c->ss != NULL -> this is a repeating scan.
650  *  u8c->ccc == -1   -> this is the first scan of a repeating scan.
658 		/* Check for the end of a decomposed character. */  in utf8byte()
669 			/* End-of-string during a scan counts as a stopper. */  in utf8byte()
673 			/* This is a continuation of the current character. */  in utf8byte()
687 		/* No leaf found implies that the input is a binary blob. */  in utf8byte()
714 		 * If this is not a stopper, then see if it updates  in utf8byte()
747 			/* Not a stopper, and not the ccc we're emitting. */  in utf8byte()
752 			/* At a stopper, restart for next ccc. */  in utf8byte()