.. _coccinelle:

..
   Copyright 2010 Nicolas Palix <npalix@diku.dk>
   Copyright 2010 Julia Lawall <julia.lawall@lip6.fr>
   Copyright 2010 Gilles Muller <Gilles.Muller@lip6.fr>

Coccinelle
##########

Coccinelle is a tool for pattern matching and text transformation that has
many uses in kernel development, including the application of complex,
tree-wide patches and detection of problematic programming patterns.

.. note::
   Linux and macOS development environments are supported, but not Windows.

Getting Coccinelle
******************

The semantic patches included in the kernel use features and options
which are provided by Coccinelle version 1.0.0-rc11 and above.
Using earlier versions will fail as the option names used by
the Coccinelle files and ``coccicheck`` have been updated.

Coccinelle is available through the package manager
of many distributions, e.g.:

.. rst-class:: rst-columns

   * Debian
   * Fedora
   * Ubuntu
   * OpenSUSE
   * Arch Linux
   * NetBSD
   * FreeBSD

Some distribution packages are obsolete, so it is recommended
to use the latest version released from the Coccinelle homepage at
http://coccinelle.lip6.fr/

or from GitHub at:

https://github.com/coccinelle/coccinelle

Once you have it, run the following commands:

.. code-block:: console

   ./autogen
   ./configure
   make

as a regular user, and install it with:

.. code-block:: console

   sudo make install

More detailed instructions for building from source can be
found at:

https://github.com/coccinelle/coccinelle/blob/master/install.txt

Supplemental documentation
**************************

For Semantic Patch Language (SmPL) grammar documentation, refer to:

http://coccinelle.lip6.fr/documentation.php

Using Coccinelle on Zephyr
**************************

``coccicheck`` is the front-end to the Coccinelle infrastructure.
Four basic modes are defined: ``patch``, ``report``, ``context``, and
``org``. The mode to use is specified by setting ``--mode=<mode>`` or
``-m=<mode>``.

* ``patch`` proposes a fix, when possible.

* ``report`` generates a list in the following format:
  ``file:line:column-column: message``

* ``context`` highlights lines of interest and their context in a
  diff-like style. Lines of interest are indicated with ``-``.

* ``org`` generates a report in the Org mode format of Emacs.

Note that not all semantic patches implement all modes. For easy use
of Coccinelle, the default mode is ``report``.

Two other modes provide some common combinations of these modes.

- ``chain`` tries the previous modes in the order above until one succeeds.

- ``rep+ctxt`` runs the report mode and then the context mode.
  It should be used with the C option (described later)
  which checks the code on a file basis.
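
The fallback performed by ``chain`` can be pictured with a short Python
sketch. This is an illustration only, not the ``coccicheck`` implementation:
``run_mode`` is a hypothetical callback standing in for one run in a given
mode, and "succeeds" is assumed to mean that it returns ``True``.

.. code-block:: python

   from typing import Callable, Optional

   # Order in which the basic modes are listed above.
   BASIC_MODES = ["patch", "report", "context", "org"]


   def chain(run_mode: Callable[[str], bool]) -> Optional[str]:
       """Try each basic mode in order and return the first that succeeds.

       ``run_mode`` is a hypothetical callback: it receives a mode name and
       returns True if the semantic patch produced a result in that mode.
       """
       for mode in BASIC_MODES:
           if run_mode(mode):
               return mode
       return None


   # Example: a semantic patch that only implements report and org.
   supported = {"report", "org"}
   print(chain(lambda mode: mode in supported))

Here ``patch`` is tried first and fails, so the sketch settles on ``report``.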

Examples
********

To make a report for every semantic patch, run the following command:

.. code-block:: console

   ./scripts/coccicheck --mode=report

To produce patches, run:

.. code-block:: console

   ./scripts/coccicheck --mode=patch

The ``coccicheck`` script applies every semantic patch available in the
sub-directories of ``scripts/coccinelle`` to the entire source code tree.

For each semantic patch, a commit message is proposed. It gives a
description of the problem being checked by the semantic patch, and
includes a reference to Coccinelle.

As with any static code analyzer, Coccinelle produces false
positives. Thus, reports must be carefully checked, and patches reviewed.

To enable verbose messages, set the ``--verbose=1`` option, for example:

.. code-block:: console

   ./scripts/coccicheck --mode=report --verbose=1

Coccinelle parallelization
**************************

By default, ``coccicheck`` tries to run as parallel as possible. To change
the parallelism, set the ``--jobs=<number>`` option. For example, to run
across 4 CPUs:

.. code-block:: console

   ./scripts/coccicheck --mode=report --jobs=4

As of Coccinelle 1.0.2, Coccinelle uses OCaml parmap for parallelization.
If support for this is detected, you will benefit from parmap parallelization.

When parmap is enabled, ``coccicheck`` enables dynamic load balancing by
passing the ``--chunksize 1`` argument. This keeps feeding threads with work
one item at a time, which avoids the situation where most work gets done by
only a few threads. With dynamic load balancing, a thread that finishes early
is fed more work.
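
The effect of handing out work one item at a time can be seen in a toy
model. The Python sketch below is not how parmap schedules work internally;
it assumes hypothetical per-file costs and compares an up-front split into
equal chunks with a scheme where each free worker takes the next file.

.. code-block:: python

   import heapq
   from typing import List


   def static_makespan(costs: List[int], workers: int) -> int:
       """Split the files into equal contiguous chunks, one per worker."""
       if not costs:
           return 0
       chunk = -(-len(costs) // workers)  # ceiling division
       loads = [sum(costs[i:i + chunk]) for i in range(0, len(costs), chunk)]
       return max(loads)


   def dynamic_makespan(costs: List[int], workers: int) -> int:
       """Hand out one file at a time to whichever worker is free first."""
       finish_times = [0] * workers  # min-heap of per-worker finish times
       heapq.heapify(finish_times)
       for cost in costs:
           earliest = heapq.heappop(finish_times)
           heapq.heappush(finish_times, earliest + cost)
       return max(finish_times)


   # Hypothetical per-file costs: two expensive files come first.
   costs = [8, 8, 1, 1, 1, 1, 1, 1]
   print(static_makespan(costs, 4))
   print(dynamic_makespan(costs, 4))

In this model the static split gives both expensive files to the same worker,
while the one-at-a-time scheme spreads them out, so the whole run finishes
sooner.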

When parmap is enabled and an error occurs in Coccinelle, the error
value is propagated back, and the return value of the ``coccicheck``
command captures it.

Using Coccinelle with a single semantic patch
*********************************************

The option ``--cocci`` can be used to check a single
semantic patch. In that case, the option must be set to
the name of the semantic patch to apply.

For instance:

.. code-block:: console

   ./scripts/coccicheck --mode=report --cocci=<example.cocci>

or:

.. code-block:: console

   ./scripts/coccicheck --mode=report --cocci=./path/to/<example.cocci>


Controlling which files are processed by Coccinelle
***************************************************

By default, the entire source tree is checked.

To apply Coccinelle to a specific directory, pass the path of that
directory as an argument.

For example, to check ``drivers/usb/`` one may write:

.. code-block:: console

   ./scripts/coccicheck --mode=patch drivers/usb/

The ``report`` mode is the default. You can select another one with the
``--mode=<mode>`` option explained above.
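
When ``coccicheck`` is driven from another script, the options described so
far can be assembled programmatically. The helper below is a hypothetical
convenience, not part of Zephyr; it only uses the ``--mode``, ``--cocci`` and
``--jobs`` options and the directory argument shown above.

.. code-block:: python

   from typing import List, Optional


   def build_coccicheck_cmd(
       mode: str = "report",
       cocci: Optional[str] = None,
       jobs: Optional[int] = None,
       path: Optional[str] = None,
   ) -> List[str]:
       """Assemble a coccicheck invocation from the options described above."""
       cmd = ["./scripts/coccicheck", f"--mode={mode}"]
       if cocci is not None:
           cmd.append(f"--cocci={cocci}")  # run a single semantic patch
       if jobs is not None:
           cmd.append(f"--jobs={jobs}")    # limit the parallelism
       if path is not None:
           cmd.append(path)                # restrict the run to one directory
       return cmd


   print(" ".join(build_coccicheck_cmd(mode="patch", path="drivers/usb/")))

The returned list can be passed to ``subprocess.run`` from the top of the
source tree.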

Debugging Coccinelle SmPL patches
*********************************

Using ``coccicheck`` is best, as it adds include options to the spatch
command line that match the options used when we compile the kernel.
You can learn what these options are by using the verbose option; you can
then manually run Coccinelle with debug options added.

Alternatively, you can debug Coccinelle runs against SmPL patches
by capturing stderr. By default, stderr is redirected to /dev/null;
to capture it, specify the ``--debug=file.err`` option to ``coccicheck``.
For instance:

.. code-block:: console

   rm -f cocci.err
   ./scripts/coccicheck --mode=patch --debug=cocci.err
   cat cocci.err

Debugging is only supported when using Coccinelle >= 1.0.2.

Additional Flags
****************

Additional flags can be passed to spatch through the ``--sp-flag``
option. This works as Coccinelle respects the last flags
given to it when options are in conflict.

.. code-block:: console

   ./scripts/coccicheck --sp-flag="--use-glimpse"

Coccinelle supports idutils as well, but this requires Coccinelle >= 1.0.6.
When no ID file is specified, Coccinelle assumes your ID database file
is ``.id-utils.index`` at the top level of the kernel. Coccinelle
carries a script, ``scripts/idutils_index.sh``, which creates the database with:

.. code-block:: console

   mkid -i C --output .id-utils.index

If you have another database filename, you can also just symlink it to this
name.

.. code-block:: console

   ./scripts/coccicheck --sp-flag="--use-idutils"

Alternatively you can specify the database filename explicitly, for
instance:

.. code-block:: console

   ./scripts/coccicheck --sp-flag="--use-idutils /full-path/to/ID"

Sometimes Coccinelle doesn't recognize or parse complex macro variables
due to insufficient definition. To make them parsable, we
explicitly provide the prototype of the complex macro using the
``--macro-file-builtins <headerfile.h>`` flag.

The ``<headerfile.h>`` should contain the complete prototype of
the complex macro, from which the spatch engine can extract the type
information required during transformation.

For example:

``Z_SYSCALL_HANDLER`` is not recognized by Coccinelle. Therefore, we
put its prototype in a header file, say for example ``mymacros.h``.

.. code-block:: console

   $ cat mymacros.h
   #define Z_SYSCALL_HANDLER int xxx

Now we pass the header file ``mymacros.h`` during transformation:

.. code-block:: console

   ./scripts/coccicheck --sp-flag="--macro-file-builtins mymacros.h"

See ``spatch --help`` to learn more about spatch options.

Note that the ``--use-glimpse`` and ``--use-idutils`` options
require external tools for indexing the code. Neither of them is
thus active by default. However, by indexing the code with
one of these tools, and depending on the cocci file used,
spatch could process the entire code base more quickly.


SmPL patch specific options
***************************

SmPL patches can have their own requirements for options passed
to Coccinelle. SmPL patch specific options can be given
at the top of the SmPL patch, for instance:

.. code-block:: console

   // Options: --no-includes --include-headers
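
To see which options a given SmPL patch asks for, the ``// Options:`` line
can be read with a few lines of Python. This is a minimal sketch; how
``coccicheck`` itself extracts the line is not shown here, and the function
name ``smpl_options`` is made up for the example.

.. code-block:: python

   import shlex
   from typing import List


   def smpl_options(script_text: str) -> List[str]:
       """Return the spatch options declared on ``// Options:`` lines."""
       prefix = "// Options:"
       options: List[str] = []
       for line in script_text.splitlines():
           stripped = line.strip()
           if stripped.startswith(prefix):
               # Split the remainder like a shell would split arguments.
               options.extend(shlex.split(stripped[len(prefix):]))
       return options


   example = "// Options: --no-includes --include-headers\n\nvirtual report\n"
   print(smpl_options(example))

For the header shown above, the sketch yields the two options
``--no-includes`` and ``--include-headers``.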

Proposing new semantic patches
******************************

New semantic patches can be proposed and submitted by kernel
developers. For the sake of clarity, they should be organized in the
sub-directories of ``scripts/coccinelle/``.

The cocci script should have the following properties:

* The script **must** have ``report`` mode.

* The first few lines should state the purpose of the script
  using ``///`` comments. Usually, this message would be used as the
  commit log when proposing a patch based on the script.

Example
=======

.. code-block:: console

   /// Use ARRAY_SIZE instead of dividing sizeof array with sizeof an element

* More detailed information about the script, with exceptional cases
  or false positives (if any), can be listed using ``//#`` comments.

Example
=======

.. code-block:: console

   //# This makes an effort to find cases where ARRAY_SIZE can be used such as
   //# where there is a division of sizeof the array by the sizeof its first
   //# element or by any indexed element or the element type. It replaces the
   //# division of the two sizeofs by ARRAY_SIZE.

* Confidence: It is a property defined to specify the accuracy level of
  the script. It can be either ``High``, ``Moderate`` or ``Low`` depending
  upon the number of false positives observed.

Example
=======

.. code-block:: console

   // Confidence: High

* Virtual rules: These are required to support the various modes framed
  in the script. The virtual rule specified in the script should have
  the corresponding mode handling rule.

Example
=======

.. code-block:: console

   virtual context

   @depends on context@
   type T;
   T[] E;
   @@
   (
   * (sizeof(E)/sizeof(*E))
   |
   * (sizeof(E)/sizeof(E[...]))
   |
   * (sizeof(E)/sizeof(T))
   )
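
The properties listed above can be checked mechanically before a script is
submitted. The following Python sketch is a hypothetical helper, not an
existing Zephyr tool; it assumes each header comment starts at the beginning
of a line and that each ``virtual`` line declares a single name.

.. code-block:: python

   import re
   from typing import Dict, List, Optional


   def describe_cocci(script_text: str) -> Dict[str, object]:
       """Collect the header properties described above from a cocci script."""
       purpose: List[str] = []   # text of ``///`` comments
       details: List[str] = []   # text of ``//#`` comments
       confidence: Optional[str] = None
       virtual: List[str] = []   # mode names declared with ``virtual``

       for line in script_text.splitlines():
           if line.startswith("///"):
               purpose.append(line[3:].strip())
           elif line.startswith("//#"):
               details.append(line[3:].strip())
           elif line.startswith("// Confidence:"):
               confidence = line.split(":", 1)[1].strip()
           else:
               match = re.match(r"virtual\s+(\w+)", line)
               if match:
                   virtual.append(match.group(1))

       return {
           "purpose": " ".join(purpose),
           "details": " ".join(details),
           "confidence": confidence,
           "virtual": virtual,
           # The guidelines above require a ``report`` mode.
           "has_report": "report" in virtual,
       }


   example = """\
   /// Use ARRAY_SIZE instead of dividing sizeof array with sizeof an element
   // Confidence: High
   virtual context
   virtual report
   """.replace("   ", "")
   info = describe_cocci(example)
   print(info["confidence"], info["virtual"], info["has_report"])

A script whose summary reports ``has_report`` as ``False`` does not meet the
first requirement in the list.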

Detailed description of the ``report`` mode
*******************************************

``report`` generates a list in the following format:

.. code-block:: console

   file:line:column-column: message

Example
=======

Running:

.. code-block:: console

   ./scripts/coccicheck --mode=report --cocci=scripts/coccinelle/array_size.cocci

will execute the following part of the SmPL script:

.. code-block:: console

   <smpl>

   @r depends on (org || report)@
   type T;
   T[] E;
   position p;
   @@
   (
   (sizeof(E)@p /sizeof(*E))
   |
   (sizeof(E)@p /sizeof(E[...]))
   |
   (sizeof(E)@p /sizeof(T))
   )

   @script:python depends on report@
   p << r.p;
   @@

   msg="WARNING: Use ARRAY_SIZE"
   coccilib.report.print_report(p[0], msg)

   </smpl>

This SmPL excerpt generates entries on the standard output, as
illustrated below:

.. code-block:: console

   ext/hal/nxp/mcux/drivers/lpc/fsl_wwdt.c:66:49-50: WARNING: Use ARRAY_SIZE
   ext/hal/nxp/mcux/drivers/lpc/fsl_ctimer.c:74:53-54: WARNING: Use ARRAY_SIZE
   ext/hal/nxp/mcux/drivers/imx/fsl_dcp.c:944:45-46: WARNING: Use ARRAY_SIZE
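
Because every entry follows the ``file:line:column-column: message`` layout,
the report is easy to post-process. The Python sketch below splits one entry
into its fields; the ``Report`` type and ``parse_report`` function are names
chosen for this example only.

.. code-block:: python

   import re
   from typing import NamedTuple, Optional


   class Report(NamedTuple):
       path: str
       line: int
       col_start: int
       col_end: int
       message: str


   # file:line:column-column: message
   REPORT_RE = re.compile(r"^(.+?):(\d+):(\d+)-(\d+): (.*)$")


   def parse_report(entry: str) -> Optional[Report]:
       """Split one report entry into its fields, or return None."""
       match = REPORT_RE.match(entry.strip())
       if match is None:
           return None
       path, lineno, start, end, message = match.groups()
       return Report(path, int(lineno), int(start), int(end), message)


   sample = (
       "ext/hal/nxp/mcux/drivers/lpc/fsl_wwdt.c:66:49-50: "
       "WARNING: Use ARRAY_SIZE"
   )
   print(parse_report(sample))

Lines that do not match the layout, such as verbose messages, yield ``None``
and can be skipped.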


Detailed description of the ``patch`` mode
******************************************

When the ``patch`` mode is available, it proposes a fix for each problem
identified.

Example
=======

Running:

.. code-block:: console

   ./scripts/coccicheck --mode=patch --cocci=scripts/coccinelle/misc/array_size.cocci

will execute the following part of the SmPL script:

.. code-block:: console

   <smpl>

   @depends on patch@
   type T;
   T[] E;
   @@
   (
   - (sizeof(E)/sizeof(*E))
   + ARRAY_SIZE(E)
   |
   - (sizeof(E)/sizeof(E[...]))
   + ARRAY_SIZE(E)
   |
   - (sizeof(E)/sizeof(T))
   + ARRAY_SIZE(E)
   )

   </smpl>

This SmPL excerpt generates patch hunks on the standard output, as
illustrated below:

.. code-block:: console

   diff -u -p a/ext/lib/encoding/tinycbor/src/cborvalidation.c b/ext/lib/encoding/tinycbor/src/cborvalidation.c
   --- a/ext/lib/encoding/tinycbor/src/cborvalidation.c
   +++ b/ext/lib/encoding/tinycbor/src/cborvalidation.c
   @@ -325,7 +325,7 @@ static inline CborError validate_number(
   static inline CborError validate_tag(CborValue *it, CborTag tag, int flags, int recursionLeft)
   {
     CborType type = cbor_value_get_type(it);
   -    const size_t knownTagCount = sizeof(knownTagData) / sizeof(knownTagData[0]);
   +    const size_t knownTagCount = ARRAY_SIZE(knownTagData);
      const struct KnownTagData *tagData = knownTagData;
      const struct KnownTagData * const knownTagDataEnd = knownTagData + knownTagCount;

Detailed description of the ``context`` mode
********************************************

``context`` highlights lines of interest and their context
in a diff-like style.

.. note::
   The diff-like output generated is NOT an applicable patch. The
   intent of the ``context`` mode is to highlight the important lines
   (annotated with minus, ``-``) and give some surrounding context
   lines. This output can be used with the diff mode of
   Emacs to review the code.

Example
=======

Running:

.. code-block:: console

   ./scripts/coccicheck --mode=context --cocci=scripts/coccinelle/array_size.cocci

will execute the following part of the SmPL script:

.. code-block:: console

   <smpl>

   @depends on context@
   type T;
   T[] E;
   @@
   (
   * (sizeof(E)/sizeof(*E))
   |
   * (sizeof(E)/sizeof(E[...]))
   |
   * (sizeof(E)/sizeof(T))
   )

   </smpl>

This SmPL excerpt generates diff hunks on the standard output, as
illustrated below:

.. code-block:: console

   diff -u -p ext/lib/encoding/tinycbor/src/cborvalidation.c /tmp/nothing/ext/lib/encoding/tinycbor/src/cborvalidation.c
   --- ext/lib/encoding/tinycbor/src/cborvalidation.c
   +++ /tmp/nothing/ext/lib/encoding/tinycbor/src/cborvalidation.c
   @@ -325,7 +325,6 @@ static inline CborError validate_number(
   static inline CborError validate_tag(CborValue *it, CborTag tag, int flags, int recursionLeft)
   {
     CborType type = cbor_value_get_type(it);
   -    const size_t knownTagCount = sizeof(knownTagData) / sizeof(knownTagData[0]);
      const struct KnownTagData *tagData = knownTagData;
      const struct KnownTagData * const knownTagDataEnd = knownTagData + knownTagCount;
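
Since the lines of interest are the ones marked with ``-``, they can be
pulled out of the context output with a small filter. The Python sketch below
is a heuristic written for this page: it assumes the only other lines that
begin with a minus sign are the ``---`` file headers.

.. code-block:: python

   from typing import List


   def lines_of_interest(diff_text: str) -> List[str]:
       """Return the source lines flagged with ``-`` in context-mode output."""
       flagged = []
       for line in diff_text.splitlines():
           # Skip the "--- <file>" header, keep single-minus lines.
           if line.startswith("-") and not line.startswith("--- "):
               flagged.append(line[1:].strip())
       return flagged


   sample = "\n".join([
       "--- ext/lib/encoding/tinycbor/src/cborvalidation.c",
       "+++ /tmp/nothing/ext/lib/encoding/tinycbor/src/cborvalidation.c",
       "@@ -325,7 +325,6 @@ static inline CborError validate_number(",
       " {",
       "-    const size_t knownTagCount = sizeof(knownTagData) / sizeof(knownTagData[0]);",
       "   const struct KnownTagData *tagData = knownTagData;",
   ])
   for flagged_line in lines_of_interest(sample):
       print(flagged_line)

For the hunk above, only the ``knownTagCount`` declaration is returned.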

Detailed description of the ``org`` mode
****************************************

``org`` generates a report in the Org mode format of Emacs.

Example
=======

Running:

.. code-block:: console

   ./scripts/coccicheck --mode=org --cocci=scripts/coccinelle/misc/array_size.cocci

will execute the following part of the SmPL script:

.. code-block:: console

   <smpl>

   @r depends on (org || report)@
   type T;
   T[] E;
   position p;
   @@
   (
   (sizeof(E)@p /sizeof(*E))
   |
   (sizeof(E)@p /sizeof(E[...]))
   |
   (sizeof(E)@p /sizeof(T))
   )

   @script:python depends on org@
   p << r.p;
   @@
   coccilib.org.print_todo(p[0], "WARNING should use ARRAY_SIZE")

   </smpl>

This SmPL excerpt generates Org entries on the standard output, as
illustrated below:

.. code-block:: console

   * TODO [[view:ext/lib/encoding/tinycbor/src/cborvalidation.c::face=ovl-face1::linb=328::colb=52::cole=53][WARNING should use ARRAY_SIZE]]
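
The fields packed into an Org entry can be recovered outside Emacs as well.
The Python sketch below takes the entry shown above apart; it assumes the
``key=value`` fields are separated by ``::`` exactly as in that sample, and
``parse_org_todo`` is a name invented for this example.

.. code-block:: python

   import re
   from typing import Dict, Optional

   ORG_RE = re.compile(
       r"^\* TODO \[\[view:(?P<target>[^\]]+)\]\[(?P<message>[^\]]*)\]\]$"
   )


   def parse_org_todo(entry: str) -> Optional[Dict[str, str]]:
       """Split an Org TODO entry into path, message and key=value fields."""
       match = ORG_RE.match(entry.strip())
       if match is None:
           return None
       # The link target is "<path>::key=value::key=value...".
       path, *fields = match.group("target").split("::")
       result = {"path": path, "message": match.group("message")}
       for field in fields:
           key, _, value = field.partition("=")
           result[key] = value
       return result


   sample = (
       "* TODO [[view:ext/lib/encoding/tinycbor/src/cborvalidation.c"
       "::face=ovl-face1::linb=328::colb=52::cole=53]"
       "[WARNING should use ARRAY_SIZE]]"
   )
   print(parse_org_todo(sample))

All values are kept as strings; ``linb``, ``colb`` and ``cole`` can be
converted with ``int`` where numbers are needed.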

Coccinelle Mailing List
***********************

Subscribe to the coccinelle mailing list:

* https://systeme.lip6.fr/mailman/listinfo/cocci

Archives:

* https://lore.kernel.org/cocci/
* https://systeme.lip6.fr/pipermail/cocci/