Lines Matching +full:a +full:- +full:c

14 requirements for kernel code: its goal is to serve as a primer for Linux
15 kernel development for experienced C programmers. I avoid implementation
20 this document, being grossly under-qualified, but I always wanted to
21 read it, and this was the only way. I hope it will grow into a
28 At any time each of the CPUs in a system can be:
30 - not associated with any process, serving a hardware interrupt;
32 - not associated with any process, serving a softirq or tasklet;
34 - running in kernel space, associated with a process (user context);
36 - running a process in user space.
39 other, but above that is a strict hierarchy: each can only be preempted
40 by the ones above it. For example, while a softirq is running on a CPU,
41 no other softirq will preempt it, but a hardware interrupt can. However,
44 We'll see a number of ways that the user context can block interrupts,
45 to become truly non-preemptable.
48 ------------
50 User context is when you are coming in from a system call or other trap:
52 interrupts. You can sleep, by calling :c:func:`schedule()`.
60 currently executing) is valid, and :c:func:`in_interrupt()`
66 :c:func:`in_interrupt()` will return a false positive.
69 -------------------------------
74 handler is never re-entered: if the same interrupt arrives, it is queued
76 fast: frequently it simply acknowledges the interrupt, marks a 'software
79 You can tell you are in a hardware interrupt, because in_hardirq() returns
84 Beware that this will return a false positive if interrupts are
88 -------------------------------------------------
90 Whenever a system call is about to return to userspace, or a hardware
92 pending (usually by hardware interrupts) are run (``kernel/softirq.c``).
96 take advantage of multiple CPUs. Shortly after we switched from wind-up
97 computers made of match-sticks and snot, we abandoned this limitation
100 ``include/linux/interrupt.h`` lists the different softirqs. A very
102 can register to have it call functions for you in a given length of
105 Softirqs are often a pain to deal with, since the same softirq will run
108 dynamically-registrable (meaning you can have as many as you want), and
118 You can tell you are in a softirq (or tasklet) using the
119 :c:func:`in_softirq()` macro (``include/linux/preempt.h``).
123 Beware that this will return a false positive if a
139 avoid context switches). It is generally a bad idea; use fixed point
142 A rigid stack limit
144 6K for most 32-bit architectures: it's about 14K on most 64-bit
150 Let's keep it that way. Your code should be 64-bit clean, and
151 endian-independent. You should also minimize CPU specific stuff,
154 architecture-dependent part of the kernel tree.
156 ioctls: Not writing a new system call
159 A system call generally looks like this::
167 First, in most cases you don't want to create a new system call. You
168 create a character device and implement an appropriate ioctl for it.
175 implementing a :c:func:`sysfs()` interface instead.
177 Inside the ioctl you're in user context to a process. When an error
178 occurs you return a negated errno (see
179 ``include/uapi/asm-generic/errno-base.h``,
180 ``include/uapi/asm-generic/errno.h`` and ``include/linux/errno.h``),
183 After you slept you should check if a signal occurred: the Unix/Linux
185 ``-ERESTARTSYS`` error. The system call entry code will switch back to
194 return -ERESTARTSYS;
205 A short note on interface design: the UNIX system call motto is "Provide
213 - You are in user context.
215 - You do not own any spinlocks.
217 - You have interrupts enabled (actually, Andi Kleen says that the
234 :c:func:`printk()`
235 ------------------
239 :c:func:`printk()` feeds kernel messages to the console, dmesg, and
241 can be used inside interrupt context, but use with caution: a machine
243 a format string mostly compatible with ANSI C printf, and C string
244 concatenation to give it a first "priority" argument::
257 :c:func:`printk()` internally uses a 1K buffer and does not catch
262 You will know when you are a real kernel hacker when you start
267 Another sidenote: the original Unix Version 6 sources had a comment
269 chit-chat". You should follow that advice.
271 :c:func:`copy_to_user()` / :c:func:`copy_from_user()` / :c:func:`get_user()` / :c:func:`put_user()`
272 ---------------------------------------------------------------------------------------------------
278 :c:func:`put_user()` and :c:func:`get_user()` are used to get
280 userspace. A pointer into userspace should never be simply dereferenced:
281 data should be copied using these routines. Both return ``-EFAULT`` or
284 :c:func:`copy_to_user()` and :c:func:`copy_from_user()` are
290 Unlike :c:func:`put_user()` and :c:func:`get_user()`, they
294 every year or so. --RR.]
297 user context (it makes no sense), with interrupts disabled, or a
300 :c:func:`kmalloc()`/:c:func:`kfree()`
301 -------------------------------------
307 These routines are used to dynamically request pointer-aligned chunks of
309 :c:func:`kmalloc()` takes an extra flag word. Important values:
317 from interrupt context. You should **really** have a good
318 out-of-memory error-handling strategy.
324 If you see a sleeping function called from invalid context warning
325 message, then maybe you called a sleeping allocation function from
330 ``asm/page_types.h``) bytes, consider using :c:func:`__get_free_pages()`
335 If you are allocating more than a page worth of bytes you can use
336 :c:func:`vmalloc()`. It'll allocate virtual memory in the kernel
340 contiguous memory for some weird device, you have a problem: it is
342 in a running kernel makes it hard. The best way is to allocate the block
343 early in the boot process via the :c:func:`alloc_bootmem()`
346 Before inventing your own cache of often-used objects consider using a
349 :c:macro:`current`
350 ------------------
354 This global variable (really a macro) contains a pointer to the current
355 task structure, so is only valid in user context. For example, when a
356 process makes a system call, this will point to the task structure of
359 :c:func:`mdelay()`/:c:func:`udelay()`
360 -------------------------------------
364 The :c:func:`udelay()` and :c:func:`ndelay()` functions can be
366 overflow - the helper function :c:func:`mdelay()` is useful here, or
367 consider :c:func:`msleep()`.
369 :c:func:`cpu_to_be32()`/:c:func:`be32_to_cpu()`/:c:func:`cpu_to_le32()`/:c:func:`le32_to_cpu()`
370 -----------------------------------------------------------------------------------------------
374 The :c:func:`cpu_to_be32()` family (where the "32" can be replaced
378 :c:func:`be32_to_cpu()`, etc.
381 variation, such as :c:func:`cpu_to_be32p()`, which take a pointer
383 is the "in-situ" family, such as :c:func:`cpu_to_be32s()`, which
386 :c:func:`local_irq_save()`/:c:func:`local_irq_restore()`
387 --------------------------------------------------------
394 enabled, you can simply use :c:func:`local_irq_disable()` and
395 :c:func:`local_irq_enable()`.
399 :c:func:`local_bh_disable()`/:c:func:`local_bh_enable()`
400 --------------------------------------------------------
410 :c:func:`smp_processor_id()`
411 ----------------------------
415 :c:func:`get_cpu()` disables preemption (so you won't suddenly get
418 continuous. You return it again with :c:func:`put_cpu()` when you
426 ------------------------------------
430 After boot, the kernel frees up a special section; functions marked with
433 initialization. ``__exit`` is used to declare a function which is only
435 compiled as a module. See the header file for use. Note that it makes no
436 sense for a function marked with ``__init`` to be exported to modules
437 with :c:func:`EXPORT_SYMBOL()` or :c:func:`EXPORT_SYMBOL_GPL()` - this
440 :c:func:`__initcall()`/:c:func:`module_init()`
441 ----------------------------------------------
445 Many parts of the kernel are well served as a module
446 (dynamically-loadable parts of the kernel). Using the
447 :c:func:`module_init()` and :c:func:`module_exit()` macros it
448 is easy to write code without #ifdefs which can operate both as a module
451 The :c:func:`module_init()` macro defines which function is to be
452 called at module insertion time (if the file is compiled as a module),
453 or at boot time: if the file is not compiled as a module the
454 :c:func:`module_init()` macro becomes equivalent to
455 :c:func:`__initcall()`, which through linker magic ensures that
458 The function can return a negative error number to cause module loading
463 :c:func:`module_exit()`
464 -----------------------
476 not be removable (except for 'rmmod -f').
478 :c:func:`try_module_get()`/:c:func:`module_put()`
479 -------------------------------------------------
483 These manipulate the module usage count, to protect against removal (a
486 :c:func:`try_module_get()` on that module: if it fails, then the
489 :c:func:`module_put()` when you're finished.
492 :c:type:`struct file_operations <file_operations>` structure.
500 A wait queue is used to wait for someone to wake you up when a certain
502 race condition. You declare a :c:type:`wait_queue_head_t`, and then processes
503 which want to wait for that condition declare a :c:type:`wait_queue_entry_t`
507 ---------
509 You declare a ``wait_queue_head_t`` using the
510 :c:func:`DECLARE_WAIT_QUEUE_HEAD()` macro, or using the
511 :c:func:`init_waitqueue_head()` routine in your initialization
515 -------
518 put yourself in the queue before checking the condition. There is a
519 macro to do this: :c:func:`wait_event_interruptible()`
522 this expression is true, or ``-ERESTARTSYS`` if a signal is received. The
523 :c:func:`wait_event()` version ignores signals.
526 ----------------------
528 Call :c:func:`wake_up()` (``include/linux/wait.h``), which will wake
538 class of operations work on :c:type:`atomic_t` (``include/asm/atomic.h``);
539 this contains a signed integer (at least 32 bits long), and you must use
540 these functions to manipulate or read :c:type:`atomic_t` variables.
541 :c:func:`atomic_read()` and :c:func:`atomic_set()` get and set
542 the counter, :c:func:`atomic_add()`, :c:func:`atomic_sub()`,
543 :c:func:`atomic_inc()`, :c:func:`atomic_dec()`, and
544 :c:func:`atomic_dec_and_test()` (returns true if it was
554 operations generally take a pointer to the bit pattern, and a bit
555 number: 0 is the least significant bit. :c:func:`set_bit()`,
556 :c:func:`clear_bit()` and :c:func:`change_bit()` set, clear,
557 and flip the given bit. :c:func:`test_and_set_bit()`,
558 :c:func:`test_and_clear_bit()` and
559 :c:func:`test_and_change_bit()` do the same thing, except return
564 ``BITS_PER_LONG``. The resulting behavior is strange on big-endian
565 platforms though so it is a good idea not to do this.
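The bit operations described above can be tried out in userspace. What follows is only a sketch, assuming C11 ``<stdatomic.h>`` as a stand-in for the kernel's own machinery: the function names match the kernel's, but these bodies and the ``atomic_ulong`` parameter type are illustrative assumptions, not the kernel's implementation or signatures. Bit 0 is the least significant bit of the first word, and the ``test_and_`` variant reports whether the bit was already set.

```c
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Userspace stand-ins built on C11 atomics; NOT the kernel's code. */
#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Atomically set bit nr, counting from the LSB of addr[0]. */
static inline void set_bit(unsigned int nr, atomic_ulong *addr)
{
	atomic_fetch_or(&addr[nr / BITS_PER_LONG],
			1UL << (nr % BITS_PER_LONG));
}

/* Atomically clear bit nr. */
static inline void clear_bit(unsigned int nr, atomic_ulong *addr)
{
	atomic_fetch_and(&addr[nr / BITS_PER_LONG],
			 ~(1UL << (nr % BITS_PER_LONG)));
}

/* Atomically flip bit nr. */
static inline void change_bit(unsigned int nr, atomic_ulong *addr)
{
	atomic_fetch_xor(&addr[nr / BITS_PER_LONG],
			 1UL << (nr % BITS_PER_LONG));
}

/* Atomically set bit nr and report whether it was set beforehand. */
static inline bool test_and_set_bit(unsigned int nr, atomic_ulong *addr)
{
	unsigned long mask = 1UL << (nr % BITS_PER_LONG);
	unsigned long old = atomic_fetch_or(&addr[nr / BITS_PER_LONG], mask);

	return (old & mask) != 0;
}
```

A caller that wants a "claim this flag exactly once" check can rely on the return value: the first ``test_and_set_bit()`` on a clear bit returns false, and every later call returns true until the bit is cleared again.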
570 Within the kernel proper, the normal linking rules apply (ie. unless a
572 be used anywhere in the kernel). However, for modules, a special
576 :c:func:`EXPORT_SYMBOL()`
577 -------------------------
581 This is the classic method of exporting a symbol: dynamically loaded
584 :c:func:`EXPORT_SYMBOL_GPL()`
585 -----------------------------
589 Similar to :c:func:`EXPORT_SYMBOL()` except that the symbols
590 exported by :c:func:`EXPORT_SYMBOL_GPL()` can only be seen by
591 modules with a :c:func:`MODULE_LICENSE()` that specifies a GPL
597 :c:func:`EXPORT_SYMBOL_NS()`
598 ----------------------------
602 This is the variant of ``EXPORT_SYMBOL()`` that allows specifying a symbol
604 Documentation/core-api/symbol-namespaces.rst
606 :c:func:`EXPORT_SYMBOL_NS_GPL()`
607 --------------------------------
611 This is the variant of ``EXPORT_SYMBOL_GPL()`` that allows specifying a symbol
613 Documentation/core-api/symbol-namespaces.rst
618 Double-linked lists ``include/linux/list.h``
619 --------------------------------------------
621 There used to be three sets of linked-list routines in the kernel
623 pressing need for a single list, it's a good choice.
625 In particular, :c:func:`list_for_each_entry()` is useful.
628 ------------------
630 For code called in user context, it's very common to defy C convention,
631 and return 0 for success, and a negative error number (eg. ``-EFAULT``) for
635 Using :c:func:`ERR_PTR()` (``include/linux/err.h``) to encode a
636 negative error number into a pointer, and :c:func:`IS_ERR()` and
637 :c:func:`PTR_ERR()` to get it back out again: avoids a separate
638 pointer parameter for the error number. Icky, but in a good way.
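To see why this trick works, here is a small userspace sketch of the idea. The three helpers are re-implemented here purely for illustration (the real ones live in ``include/linux/err.h``), and ``MAX_ERRNO`` as 4095 plus the ``make_buffer()`` function are assumptions made up for this example. The point is that the top few thousand addresses are never valid pointers, so a small negative errno can hide there.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Userspace stand-ins; the kernel's versions are in include/linux/err.h. */
#define MAX_ERRNO 4095

/* Encode a negative errno value as a pointer. */
static inline void *ERR_PTR(long error)
{
	/* Only small negative values fit in the reserved top range. */
	assert(error < 0 && error >= -MAX_ERRNO);
	return (void *)(intptr_t)error;
}

/* Recover the errno value from an encoded pointer. */
static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* True if ptr falls in the last MAX_ERRNO addresses. */
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical allocator: returns a buffer or an encoded errno. */
static void *make_buffer(size_t len)
{
	void *buf;

	if (len == 0)
		return ERR_PTR(-EINVAL);

	buf = malloc(len);
	if (!buf)
		return ERR_PTR(-ENOMEM);

	return buf;
}
```

The caller checks ``IS_ERR()`` on the result before using it, and calls ``PTR_ERR()`` to find out what went wrong. Note that a plain ``NULL`` check is not enough for such a function, since an encoded error is not ``NULL``.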
641 --------------------
645 their toes: it reflects a fundamental change (eg. can no longer be
647 which were caught before). Usually this is accompanied by a fairly
648 complete note to the linux-kernel mailing list; search the archive.
649 Simply doing a global replace on the file usually makes things **worse**.
652 ------------------------------
669 --------------
674 info page section "C Extensions" for more details - Yes, really the info
675 page, the man page is only a short summary of the stuff in info).
677 - Inline functions
679 - Statement expressions (ie. the ({ and }) constructs).
681 - Declaring attributes of a function / variable / type
684 - typeof
686 - Zero length arrays
688 - Macro varargs
690 - Arithmetic on void pointers
692 - Non-Constant initializers
694 - Assembler Instructions (not outside arch/ and include/asm/)
696 - Function names as strings (__func__).
698 - __builtin_constant_p()
705 C++
706 ---
708 Using C++ in the kernel is usually a bad idea, because the kernel does
714 ---
717 the top of .c files) to abstract away functions rather than using ``#if``
718 pre-processor statements throughout the source code.
724 make a neat patch, there's administrative work to be done:
726 - Figure out whose pond you've been pissing in. Look at the top of the
734 will look when they find a bug, or when **they** want to make a change.
736 - Usually you want a configuration option for your kernel hack. Edit
739 ``Documentation/kbuild/kconfig-language.rst``.
747 - Edit the ``Makefile``: the CONFIG variables are exported here so you
748 can usually just add an "obj-$(CONFIG_xxx) += xxx.o" line. The syntax
751 - Put yourself in ``CREDITS`` if you've done something noteworthy,
752 usually beyond a single file (your name should be at the top of the
754 when changes are made to a subsystem, and hear about bugs; it implies
755 a more-than-passing commitment to some part of the code.
757 - Finally, don't forget to read
758 ``Documentation/process/submitting-patches.rst`` and possibly
759 ``Documentation/process/submitting-drivers.rst``.
776 * Kernel pointers have redundant information, so we can use a
777 * scheme where we can return either an error code or a dentry
780 * This should be a per-architecture thing, to allow different
785 #define IS_ERR(ptr) ((unsigned long)(ptr) > (unsigned long)(-1000))
799 * At least we *know* we can't spell, and use a spell-checker.
808 /* Tested on SS-5, SS-10. Probably someone at Sun applied a spell-checker. */
827 clarity fixes, and some excellent non-obvious points. Werner Almesberger
828 for giving me a great summary of :c:func:`disable_irq()`, and Jes