Linux kernel coding style

This is a short document describing the preferred coding style for the
Linux kernel. Coding style is very personal, and I won't _force_ my
views on anybody, but this is what goes for anything that I have to be
able to maintain, and I'd prefer it for most other things too. Please
at least consider the points made here.

First off, I'd suggest printing out a copy of the GNU coding standards,
and NOT read it. Burn them, it's a great symbolic gesture.

Anyway, here goes:


Chapter 1: Indentation

Tabs are 8 characters, and thus indentations are also 8 characters.
There are heretic movements that try to make indentations 4 (or even 2!)
characters deep, and that is akin to trying to define the value of PI to
be 3.

Rationale: The whole idea behind indentation is to clearly define where
a block of control starts and ends. Especially when you've been looking
at your screen for 20 straight hours, you'll find it a lot easier to see
how the indentation works if you have large indentations.

Now, some people will claim that having 8-character indentations makes
the code move too far to the right, and makes it hard to read on an
80-character terminal screen. The answer to that is that if you need
more than 3 levels of indentation, you're screwed anyway, and should fix
your program.

In short, 8-char indents make things easier to read, and have the added
benefit of warning you when you're nesting your functions too deep.
Heed that warning.
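
To make the point about nesting concrete, here is a small userspace
sketch. The names sum_valid_nested(), sum_valid_flat() and is_valid() are
made up for this example and are not kernel functions. Both versions
compute the same result; the second one avoids the deep nesting by
bailing out early and moving the test into a helper.

```c
#include <stddef.h>

/* Hypothetical helper: a sample is valid if it is positive and even. */
static int is_valid(int sample)
{
	return sample > 0 && sample % 2 == 0;
}

/* Deeply nested version: the addition ends up inside four nested blocks. */
int sum_valid_nested(const int *samples, size_t count)
{
	int sum = 0;
	size_t i;

	if (samples != NULL) {
		for (i = 0; i < count; i++) {
			if (samples[i] > 0) {
				if (samples[i] % 2 == 0)
					sum += samples[i];
			}
		}
	}
	return sum;
}

/* Flattened version: same result, never more than two blocks deep. */
int sum_valid_flat(const int *samples, size_t count)
{
	int sum = 0;
	size_t i;

	if (samples == NULL)
		return 0;

	for (i = 0; i < count; i++) {
		if (!is_valid(samples[i]))
			continue;
		sum += samples[i];
	}
	return sum;
}
```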

Don't put multiple statements on a single line unless you have
something to hide:

	if (condition) do_this;
	  do_something_everytime;

Outside of comments, documentation and except in Kconfig, spaces are never
used for indentation, and the above example is deliberately broken.

Get a decent editor and don't leave whitespace at the end of lines.


Chapter 2: Breaking long lines and strings

Coding style is all about readability and maintainability using commonly
available tools.

The limit on the length of lines is 80 columns and this is a hard limit.

Statements longer than 80 columns will be broken into sensible chunks.
Descendants are always substantially shorter than the parent and are placed
substantially to the right. The same applies to function headers with a long
argument list. Long strings are broken into shorter strings as well.

void fun(int a, int b, int c)
{
	if (condition)
		printk(KERN_WARNING "Warning this is a long printk with "
						"3 parameters a: %u b: %u "
						"c: %u \n", a, b, c);
	else
		next_statement;
}
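
The same rules can be tried out in user space. The sketch below is only
an illustration: format_transfer_warning() and its parameters are
invented for this example, and snprintf() stands in for printk() so that
the code compiles outside the kernel. It shows a long argument list and a
long format string, each broken into shorter pieces placed well to the
right.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper; the long argument list continues to the right. */
int format_transfer_warning(char *buf, size_t len, unsigned int channel,
				unsigned int requested, unsigned int done)
{
	/* Adjacent string literals are concatenated by the compiler. */
	return snprintf(buf, len, "Warning: short transfer on channel %u, "
					"requested: %u "
					"done: %u\n",
					channel, requested, done);
}
```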


Chapter 3: Placing Braces

The other issue that always comes up in C styling is the placement of
braces. Unlike the indent size, there are few technical reasons to
choose one placement strategy over the other, but the preferred way, as
shown to us by the prophets Kernighan and Ritchie, is to put the opening
brace last on the line, and put the closing brace first, thusly:

	if (x is true) {
		we do y
	}

However, there is one special case, namely functions: they have the
opening brace at the beginning of the next line, thus:

	int function(int x)
	{
		body of function
	}

Heretic people all over the world have claimed that this inconsistency
is ... well ... inconsistent, but all right-thinking people know that
(a) K&R are _right_ and (b) K&R are right. Besides, functions are
special anyway (you can't nest them in C).

Note that the closing brace is empty on a line of its own, _except_ in
the cases where it is followed by a continuation of the same statement,
i.e. a "while" in a do-statement or an "else" in an if-statement, like
this:

	do {
		body of do-loop
	} while (condition);

and

	if (x == y) {
		..
	} else if (x > y) {
		...
	} else {
		....
	}
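
For a version that actually compiles, the fragment below applies all of
the above in one place. It is a userspace sketch; clamp_to_range() and
count_digits() are names made up for this example. The function brace
sits on its own line, every other opening brace is last on its line, and
"else" and "while" share a line with the closing brace.

```c
/* Hypothetical helper: clamp value into [low, high], note if it changed. */
int clamp_to_range(int value, int low, int high, int *clamped)
{
	if (value < low) {
		value = low;
		*clamped = 1;
	} else if (value > high) {
		value = high;
		*clamped = 1;
	} else {
		*clamped = 0;
	}
	return value;
}

/* Hypothetical helper: count the decimal digits of value (at least 1). */
int count_digits(unsigned int value)
{
	int digits = 0;

	do {
		digits++;
		value /= 10;
	} while (value != 0);

	return digits;
}
```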

Rationale: K&R.

Also, note that this brace-placement minimizes the number of empty
(or almost empty) lines, without any loss of readability. Thus, as the
supply of new-lines on your screen is not a renewable resource (think
25-line terminal screens here), you have more empty lines to put
comments on.


Chapter 4: Naming

C is a Spartan language, and so should your naming be. Unlike Modula-2
and Pascal programmers, C programmers do not use cute names like
ThisVariableIsATemporaryCounter. A C programmer would call that
variable "tmp", which is much easier to write, and not in the least more
difficult to understand.

HOWEVER, while mixed-case names are frowned upon, descriptive names for
global variables are a must. To call a global function "foo" is a
shooting offense.

GLOBAL variables (to be used only if you _really_ need them) need to
have descriptive names, as do global functions. If you have a function
that counts the number of active users, you should call that
"count_active_users()" or similar; you should _not_ call it "cntusr()".

Encoding the type of a function into the name (so-called Hungarian
notation) is brain damaged - the compiler knows the types anyway and can
check those, and it only confuses the programmer. No wonder MicroSoft
makes buggy programs.

LOCAL variable names should be short, and to the point. If you have
some random integer loop counter, it should probably be called "i".
Calling it "loop_counter" is non-productive, if there is no chance of it
being misunderstood. Similarly, "tmp" can be just about any type of
variable that is used to hold a temporary value.
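
As a sketch of these rules (the user table and its field are invented
for this example, not taken from the kernel): the global function gets a
descriptive name, while the loop counter inside it stays short.

```c
#include <stddef.h>

/* Hypothetical record type, for illustration only. */
struct user {
	int active;
};

/* Global function: the name says what it does. */
size_t count_active_users(const struct user *users, size_t nr_users)
{
	size_t count = 0;
	size_t i;		/* plain loop counter: "i" is enough */

	for (i = 0; i < nr_users; i++) {
		if (users[i].active)
			count++;
	}
	return count;
}
```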

If you are afraid to mix up your local variable names, you have another
problem, which is called the function-growth-hormone-imbalance syndrome.
See next chapter.


Chapter 5: Functions

Functions should be short and sweet, and do just one thing. They should
fit on one or two screenfuls of text (the ISO/ANSI screen size is 80x24,
as we all know), and do one thing and do that well.

The maximum length of a function is inversely proportional to the
complexity and indentation level of that function. So, if you have a
conceptually simple function that is just one long (but simple)
case-statement, where you have to do lots of small things for a lot of
different cases, it's OK to have a longer function.

However, if you have a complex function, and you suspect that a
less-than-gifted first-year high-school student might not even
understand what the function is all about, you should adhere to the
maximum limits all the more closely. Use helper functions with
descriptive names (you can ask the compiler to in-line them if you think
it's performance-critical, and it will probably do a better job of it
than you would have done).
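
A sketch of what this can look like, again in user space and with
invented names (checksum_packet() and its helpers are not kernel
functions): the top-level function reads as a short list of steps, and
the small helpers are marked "static inline", which is a hint the
compiler is free to follow or ignore.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: add one byte into a running sum. */
static inline uint32_t add_byte(uint32_t sum, uint8_t byte)
{
	return sum + byte;
}

/* Hypothetical helper: fold the carries back into 16 bits. */
static inline uint16_t fold_sum(uint32_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* Does one thing: returns the folded byte sum of a buffer. */
uint16_t checksum_packet(const uint8_t *data, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i < len; i++)
		sum = add_byte(sum, data[i]);

	return fold_sum(sum);
}
```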

Another measure of the function is the number of local variables. They
shouldn't exceed 5-10, or you're doing something wrong. Re-think the
function, and split it into smaller pieces. A human brain can
generally easily keep track of about 7 different things, anything more
and it gets confused. You know you're brilliant, but maybe you'd like
to understand what you did 2 weeks from now.


Chapter 6: Centralized exiting of functions

Albeit deprecated by some people, the equivalent of the goto statement is
used frequently by compilers in the form of the unconditional jump
instruction.

The goto statement comes in handy when a function exits from multiple
locations and some common work such as cleanup has to be done.

The rationale is:

	- unconditional statements are easier to understand and follow
	- nesting is reduced
	- errors by not updating individual exit points when making
	  modifications are prevented
	- saves the compiler work to optimize redundant code away ;)

int fun(int a)
{
	int result = 0;
	char *buffer = kmalloc(SIZE, GFP_KERNEL);

	if (buffer == NULL)
		return -ENOMEM;

	if (condition1) {
		while (loop1) {
			...
		}
		result = 1;
		goto out;
	}
	...
out:
	kfree(buffer);
	return result;
}
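
The kernel fragment above cannot be compiled on its own, so here is a
userspace sketch of the same pattern. It is an illustration under
assumptions: malloc() and free() stand in for kmalloc() and kfree(), and
copy_and_check() with its arguments is a made-up function. Every path
taken after the allocation leaves through the single "out" label, so the
buffer is freed in exactly one place.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical example: returns 1 if the first len bytes of src contain
 * needle, 0 if they do not, and a negative errno value on failure.
 */
int copy_and_check(const char *src, size_t len, char needle)
{
	int result = 0;
	char *buffer;

	if (src == NULL)
		return -EINVAL;	/* nothing to clean up yet */

	buffer = malloc(len + 1);
	if (buffer == NULL)
		return -ENOMEM;

	if (len == 0) {
		result = -EINVAL;
		goto out;
	}

	memcpy(buffer, src, len);
	buffer[len] = '\0';

	if (memchr(buffer, needle, len) != NULL) {
		result = 1;
		goto out;
	}
	/* more work on buffer could go here */
out:
	free(buffer);
	return result;
}
```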


Chapter 7: Commenting

Comments are good, but there is also a danger of over-commenting. NEVER
try to explain HOW your code works in a comment: it's much better to
write the code so that the _working_ is obvious, and it's a waste of
time to explain badly written code.

Generally, you want your comments to tell WHAT your code does, not HOW.
Also, try to avoid putting comments inside a function body: if the
function is so complex that you need to separately comment parts of it,
you should probably go back to chapter 5 for a while. You can make
small comments to note or warn about something particularly clever (or
ugly), but try to avoid excess. Instead, put the comments at the head
of the function, telling people what it does, and possibly WHY it does
it.
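
For example (a sketch with a made-up function, next_power_of_two(), and
a made-up reason for its existence; neither comes from the kernel tree):
the comment at the head says what the function does and why, and the
body gets one short note for the only non-obvious step.

```c
/*
 * next_power_of_two - round value up to the next power of two
 *
 * Returns the smallest power of two that is >= value, or 0 if that
 * does not fit in an unsigned int. Callers use this to size hash
 * tables so that an index can be masked instead of divided.
 */
unsigned int next_power_of_two(unsigned int value)
{
	unsigned int result = 1;

	while (result < value) {
		/* the next shift would overflow: report failure */
		if (result > (~0u >> 1))
			return 0;
		result <<= 1;
	}
	return result;
}
```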


Chapter 8: You've made a mess of it

That's OK, we all do. You've probably been told by your long-time Unix
user helper that "GNU emacs" automatically formats the C sources for
you, and you've noticed that yes, it does do that, but the defaults it
uses are less than desirable (in fact, they are worse than random
typing - an infinite number of monkeys typing into GNU emacs would never
make a good program).

So, you can either get rid of GNU emacs, or change it to use saner
values. To do the latter, you can stick the following in your .emacs file:

(defun linux-c-mode ()
  "C mode with adjusted defaults for use with the Linux kernel."
  (interactive)
  (c-mode)
  (c-set-style "K&R")
  (setq tab-width 8)
  (setq indent-tabs-mode t)
  (setq c-basic-offset 8))

This will define the M-x linux-c-mode command. When hacking on a
module, if you put the string -*- linux-c -*- somewhere on the first
two lines, this mode will be automatically invoked. Also, you may want
to add

(setq auto-mode-alist (cons '("/usr/src/linux.*/.*\\.[ch]$" . linux-c-mode)
			auto-mode-alist))

to your .emacs file if you want to have linux-c-mode switched on
automagically when you edit source files under /usr/src/linux.

But even if you fail in getting emacs to do sane formatting, not
everything is lost: use "indent".

Now, again, GNU indent has the same brain-dead settings that GNU emacs
has, which is why you need to give it a few command line options.
However, that's not too bad, because even the makers of GNU indent
recognize the authority of K&R (the GNU people aren't evil, they are
just severely misguided in this matter), so you just give indent the
options "-kr -i8" (stands for "K&R, 8 character indents"), or use
"scripts/Lindent", which indents in the latest style.

"indent" has a lot of options, and especially when it comes to comment
re-formatting you may want to take a look at the man page. But
remember: "indent" is not a fix for bad programming.


Chapter 9: Configuration-files

For configuration options (arch/xxx/Kconfig, and all the Kconfig files),
somewhat different indentation is used.

Help text is indented with 2 spaces.

if CONFIG_EXPERIMENTAL
	tristate CONFIG_BOOM
	default n
	help
	  Apply nitroglycerine inside the keyboard (DANGEROUS)
	bool CONFIG_CHEER
	depends on CONFIG_BOOM
	default y
	help
	  Output nice messages when you explode
endif

Generally, CONFIG_EXPERIMENTAL should surround all options not considered
stable. All options that are known to trash data (experimental write-
support for file-systems, for instance) should be denoted (DANGEROUS), other
experimental options should be denoted (EXPERIMENTAL).


Chapter 10: Data structures

Data structures that have visibility outside the single-threaded
environment they are created and destroyed in should always have
reference counts. In the kernel, garbage collection doesn't exist (and
outside the kernel garbage collection is slow and inefficient), which
means that you absolutely _have_ to reference count all your uses.

Reference counting means that you can avoid locking, and allows multiple
users to have access to the data structure in parallel - without having
to worry about the structure suddenly going away from under them just
because they slept or did something else for a while.

Note that locking is _not_ a replacement for reference counting.
Locking is used to keep data structures coherent, while reference
counting is a memory management technique. Usually both are needed, and
they are not to be confused with each other.
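
A minimal userspace sketch of the idea follows. It assumes C11 atomics
in place of the kernel's own counter primitives, and "struct session"
with its get/put helpers is invented for this example. It only covers
the memory-management half: whatever list or table the object can be
found through would still need its own locking.

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical object; "refs" counts the users that hold a pointer. */
struct session {
	atomic_int refs;
	int id;
};

/* Create a session; the caller owns the first reference. */
struct session *session_create(int id)
{
	struct session *s = malloc(sizeof(*s));

	if (s == NULL)
		return NULL;
	atomic_init(&s->refs, 1);
	s->id = id;
	return s;
}

/* Take an extra reference before handing the pointer to another user. */
struct session *session_get(struct session *s)
{
	atomic_fetch_add(&s->refs, 1);
	return s;
}

/*
 * Drop a reference. Whoever drops the last one frees the object.
 * Returns 1 if the object was freed, 0 otherwise.
 */
int session_put(struct session *s)
{
	if (atomic_fetch_sub(&s->refs, 1) != 1)
		return 0;
	free(s);
	return 1;
}
```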

Many data structures can indeed have two levels of reference counting,
when there are users of different "classes". The subclass count counts
the number of subclass users, and decrements the global count just once
when the subclass count goes to zero.

Examples of this kind of "multi-level-reference-counting" can be found in
memory management ("struct mm_struct": mm_users and mm_count), and in
filesystem code ("struct super_block": s_count and s_active).
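
The two-level scheme can be sketched as follows. This is not the
mm_struct or super_block code; "struct resource", its fields and its
helpers are hypothetical, and plain ints are used only to keep the
sketch short (real code would need atomic counters or a lock). All
subclass users together hold exactly one reference in the global count,
and give it back when the last of them goes away.

```c
#include <stdlib.h>

/* Hypothetical structure with two classes of users. */
struct resource {
	int users;		/* subclass count */
	int count;		/* global count */
	int payload_live;	/* state that only "users" need */
};

/* Create with one subclass user, which pins one global reference. */
struct resource *resource_create(void)
{
	struct resource *r = malloc(sizeof(*r));

	if (r == NULL)
		return NULL;
	r->users = 1;
	r->count = 1;
	r->payload_live = 1;
	return r;
}

/* Take a global reference. */
void resource_grab(struct resource *r)
{
	r->count++;
}

/* Drop a global reference; returns 1 if the structure was freed. */
int resource_drop(struct resource *r)
{
	if (--r->count != 0)
		return 0;
	free(r);
	return 1;
}

/* Add a subclass user. */
void resource_get_user(struct resource *r)
{
	r->users++;
}

/* Remove a subclass user; returns 1 if the structure was freed. */
int resource_put_user(struct resource *r)
{
	if (--r->users != 0)
		return 0;
	r->payload_live = 0;		/* tear down what only users need */
	return resource_drop(r);	/* global count drops just once */
}
```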

Remember: if another thread can find your data structure, and you don't
have a reference count on it, you almost certainly have a bug.


Chapter 11: Macros, Enums, Inline functions and RTL

Names of macros defining constants and labels in enums are capitalized.

#define CONSTANT 0x12345

Enums are preferred when defining several related constants.

CAPITALIZED macro names are appreciated but macros resembling functions
may be named in lower case.

Generally, inline functions are preferable to macros resembling functions.
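
A short sketch of both points, with names invented for this example
(link_speed, SQUARE_MACRO, square() and next_value() are all
hypothetical): related constants go into an enum, and the inline
function evaluates its argument exactly once, where the function-like
macro evaluates it twice.

```c
/* Related constants grouped in an enum; labels are capitalized. */
enum link_speed {
	LINK_SPEED_10 = 10,
	LINK_SPEED_100 = 100,
	LINK_SPEED_1000 = 1000,
};

/* Function-like macro: expands its argument twice. */
#define SQUARE_MACRO(x)		((x) * (x))

/* Inline function: the argument is evaluated once and type-checked. */
static inline int square(int x)
{
	return x * x;
}

/* Hypothetical helper with a side effect, used to show the difference. */
static int next_value(int *counter)
{
	return ++(*counter);
}
```

Calling square(next_value(&c)) bumps the counter once;
SQUARE_MACRO(next_value(&c)) bumps it twice, which is rarely what the
caller had in mind.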

Macros with multiple statements should be enclosed in a do - while block:

#define macrofun(a, b, c)			\
	do {					\
		if (a == 5)			\
			do_this(b, c);		\
	} while (0)
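
The reason for the do - while wrapper shows up as soon as the macro is
used in an if/else. In the sketch below, do_this(), the "calls" variable
and run_macrofun() are stand-ins invented for this example, and the
macro arguments are parenthesized as an extra precaution. Because the
macro expands to a single statement that still wants its semicolon, the
"else" pairs with the outer "if" as the reader expects. With bare braces
the semicolon would end the if-statement and the "else" would not
compile; with no wrapper at all the "else" would silently attach to the
macro's inner "if".

```c
static int calls;

/* Hypothetical stand-in for the do_this() used in the text. */
static void do_this(int b, int c)
{
	calls += b + c;
}

#define macrofun(a, b, c)			\
	do {					\
		if ((a) == 5)			\
			do_this((b), (c));	\
	} while (0)

/* Returns b + c if flag is set and a is 5, 0 if a is not 5, -1 if !flag. */
int run_macrofun(int a, int flag)
{
	calls = 0;
	if (flag)
		macrofun(a, 1, 2);	/* expands to one statement */
	else
		calls = -1;
	return calls;
}
```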

Things to avoid when using macros:

1) macros that affect control flow:

#define FOO(x)					\
	do {					\
		if (blah(x) < 0)		\
			return -EBUGGERED;	\
	} while (0)

is a _very_ bad idea. It looks like a function call but exits the "calling"
function; don't break the internal parsers of those who will read the code.

2) macros that depend on having a local variable with a magic name:

#define FOO(val) bar(index, val)

might look like a good thing, but it's confusing as hell when one reads the
code and it's prone to breakage from seemingly innocent changes.

3) macros with arguments that are used as l-values: FOO(x) = y; will
bite you if somebody e.g. turns FOO into an inline function.

4) forgetting about precedence: macros defining constants using expressions
must enclose the expression in parentheses. Beware of similar issues with
macros using parameters.

#define CONSTANT 0x4000
#define CONSTEXP (CONSTANT | 3)
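
To see what goes wrong without the parentheses, compare the pairs
below. The _BAD variants and the DOUBLE macros are hypothetical and
deliberately broken for the sake of the example; a compiler may well
warn about them, which is rather the point.

```c
#define CONSTANT	0x4000
#define CONSTEXP	(CONSTANT | 3)	/* parenthesized: safe */
#define CONSTEXP_BAD	CONSTANT | 3	/* deliberately broken */

/* Macros using parameters need the same care. */
#define DOUBLE(x)	((x) * 2)
#define DOUBLE_BAD(x)	x * 2		/* deliberately broken */

int masked_good(void)
{
	return CONSTEXP & 1;		/* (0x4000 | 3) & 1 */
}

int masked_bad(void)
{
	return CONSTEXP_BAD & 1;	/* 0x4000 | (3 & 1) */
}

int doubled_good(void)
{
	return DOUBLE(1 + 2);		/* (1 + 2) * 2 */
}

int doubled_bad(void)
{
	return DOUBLE_BAD(1 + 2);	/* 1 + (2 * 2) */
}
```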

The cpp manual deals with macros exhaustively. The gcc internals manual also
covers RTL, which is used frequently with assembly language in the kernel.


Chapter 12: Printing kernel messages

Kernel developers like to be seen as literate. Do mind the spelling
of kernel messages to make a good impression. Do not use crippled
words like "dont"; use "do not" or "don't" instead.

Kernel messages do not have to be terminated with a period.

Printing numbers in parentheses (%d) adds no value and should be avoided.
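
As an illustration only (format_link_message() and the message text are
made up, and snprintf() stands in for printk() so the sketch compiles in
user space), a message following these rules might be built like this:

```c
#include <stdio.h>

/* Hypothetical message formatter; snprintf() stands in for printk(). */
int format_link_message(char *buf, size_t len, const char *dev, int speed)
{
	/*
	 * Avoid: "%s: link dont work (%d)."
	 * Words spelled out, no parentheses around the number, no
	 * trailing period.
	 */
	return snprintf(buf, len, "%s: link is not ready, speed %d Mbit/s\n",
			dev, speed);
}
```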


Chapter 13: References

The C Programming Language, Second Edition
by Brian W. Kernighan and Dennis M. Ritchie.
Prentice Hall, Inc., 1988.
ISBN 0-13-110362-8 (paperback), 0-13-110370-9 (hardback).
URL: http://cm.bell-labs.com/cm/cs/cbook/

The Practice of Programming
by Brian W. Kernighan and Rob Pike.
Addison-Wesley, Inc., 1999.
ISBN 0-201-61586-X.
URL: http://cm.bell-labs.com/cm/cs/tpop/

GNU manuals - where in compliance with K&R and this text - for cpp, gcc,
gcc internals and indent, all available from http://www.gnu.org

WG14 is the international standardization working group for the programming
language C, URL: http://std.dkuug.dk/JTC1/SC22/WG14/

--
Last updated on 16 February 2004 by a community effort on LKML.