Proper Linux Kernel Coding Style

Some simple rules, written and unwritten, to ensure that other kernel users understand your work.

As more people and companies start to
write Linux kernel code, understanding the accepted kernel
coding style and conventions becomes very important. This
article starts by explaining the written kernel coding style,
as documented in the kernel's Documentation/CodingStyle file, and
then introduces some of the “unwritten” kernel coding
style rules.

Why Rules?

Why are there kernel programming style rules in the first
place? Why not just let authors code in whatever style they want,
and let everyone live with it? After all, code formatting does not
affect memory use, execution speed or anything else a normal user
of the kernel would see. The reason for rules can be summed up with
this passage from Elliot Soloway and Kate Ehrlich's article (see
Resources):

Our studies support the claim that knowledge of
programming plans and rules of programming discourse can have a
significant impact on program comprehension. In their book called
Elements of Style, Kernighan and Plauger also
identify what we would call discourse rules. Our empirical results
put teeth into these rules: it is not merely a matter of aesthetics
that programs should be written in a particular style. Rather there
is a psychological basis for writing programs in a conventional
manner: programmers have strong expectations that other programmers
will follow these discourse rules. If the rules are violated, then
the utility afforded by the expectations that programmers have
built up over time is effectively nullified.

There have been a number of other studies about how a large
body of code written in a common style directly affects how easy it
is to understand the code, review it and revise it quickly.

Because the number of developers who look at the Linux
kernel code is very large, it is in the best interest of the
project to have a consistent style guideline. This makes the code
easier to understand, both for someone reading it for the first
time and for someone revisiting his or her own code a number of
months later. It also allows someone else to read,
understand and potentially fix and enhance your code easily, which is one
of the greatest strengths of open-source code.

What Are the Rules?

So now that we have an understanding of why there should be
rules, what are they? Linus Torvalds and other kernel programmers
have written a short document that details some of the kernel
programming rules. This document is located in the
Documentation/CodingStyle file in the kernel source tree. It is
required reading for anyone who wants to contribute to the Linux
kernel. The following is a summary of these rules.

Indentation

Indentation is done with the <TAB> character, and all tabs are
eight characters wide. This makes it easier to see where different
blocks of code start and end. If you find that your code is indented
too deeply, with more than three levels of indentation pushing the
code too far to the right of the screen, then you should fix
this.
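
One common way to fix deep nesting is to return or continue early and to move the inner test into a small helper function. The following is only a minimal sketch, not code from the kernel tree; struct item, item_ready() and count_ready() are hypothetical names invented for this illustration.

```c
#include <stddef.h>

/* Hypothetical record type, used only for this illustration. */
struct item {
	int enabled;
	int value;
};

/* Helper: the per-item test lives here instead of in nested ifs. */
static int item_ready(const struct item *it)
{
	if (!it)
		return 0;
	if (!it->enabled)
		return 0;
	return it->value > 0;
}

/* Count ready items; the loop body stays at one level of indentation. */
static size_t count_ready(const struct item *items, size_t n)
{
	size_t i;
	size_t count = 0;

	for (i = 0; i < n; i++) {
		if (!item_ready(&items[i]))
			continue;
		count++;
	}
	return count;
}
```

Written with nested if statements inside the loop, the same logic would sit three or four tabs deep; with the helper and the early continue it never goes past two.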

Placing Braces

The original authors of UNIX placed their braces with the
opening brace last on the line, and the closing brace first on the
line:

if (x is true) {
	we do y
}

Therefore, the Linux kernel uses this style. The exceptions
to this rule are functions, which have the opening brace at the
beginning of the line:

int function(int x)
{
	body of function
}

Again, this is how Kernighan and Ritchie wrote their code.

For good examples of proper indentation and brace style, look
at any of the fs/*.c files or anything in the kernel/*.c files.
Generally, most of the kernel is in the proper indentation and
brace style, but there are some notable exceptions. The files
fs/devfs/*.c and drivers/scsi/qla1280.* are good examples of how
not to do indentation and braces.

A script can be used to run the indent program if you have to
convert a large amount of code to the correct format. This script is
located at scripts/Lindent in the kernel source tree.

Naming

The names of your variables and functions should be descriptive and
concise. Don't use long, flowery names like
CommandAllocationGroupSize or
DAC960_V1_EnableMemoryMailboxInterface(); use names like cmd_group_size
or enable_mem_mailbox() instead. Mixed-case names are frowned upon, and
encoding the type of the variable or function in the name (like
“Hungarian notation”) is forbidden.

Global variables should be used only if they are absolutely
necessary. Local variables should be short and to the point. Valid
local loop variable names include “i” and “j”, while
“loop_counter” would be too verbose; “tmp” is allowed for any
short-lived temporary variable.
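
As a small sketch of these naming rules, consider the function below. It is a hypothetical example written for this illustration, not taken from the kernel source: a short, descriptive lowercase function name, “i” for the loop counter and “tmp” for a short-lived temporary.

```c
#include <stddef.h>

/* Reverse a buffer in place. */
static void reverse_buf(unsigned char *buf, size_t len)
{
	size_t i;
	unsigned char tmp;

	for (i = 0; i < len / 2; i++) {
		tmp = buf[i];
		buf[i] = buf[len - 1 - i];
		buf[len - 1 - i] = tmp;
	}
}
```

A mixed-case name such as ReverseBufferContents(), or a loop variable called loop_counter, would add length here without adding information.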

Again, good examples of proper names can be found in fs/*.c.
A lot of driver code has bad variable names, as it has been ported
from other operating systems. Examples of how
not to name functions and variables include
drivers/block/DAC960.* and drivers/scsi/cpqfs*.

I need some information:
What are the "coding standards" used for developing a Linux device driver?
If there are standards, where can I get all the details?
If they are available on a website, please let me know.

I personally think the 'ifdef' in the C code is vastly superior. For one thing, it makes it obvious to anyone looking at the code that the function is only called if CONFIG_USB_HIDDEV is defined. For another thing, the proposed 'improvement' requires someone to look at the header file to understand the C code, rather than having the C code be self-documenting.

The claimed benefit that the code is more readable is the opposite of the truth. The original code makes it clear that nothing is done if CONFIG_USB_HIDDEV is not defined. The 'improved' code obscures that.

As for it being easier to maintain, I presume this is because the selection of the appropriate hid function is now part of the hid code rather than scattered through all the code that might call the hid code. I do agree that there is some value to this, but weighed against the fact that it obscures in which cases the function does something and in which cases it does not, I think it's just not worth it.

Especially if you try to generalize this. For example, what if 'HID_CLAIMED_HIDDEV' is not defined unless CONFIG_USB_HIDDEV is defined? Would you suggest a 'hid_is_claimed' macro that always returns 'true' or 'false' if 'CONFIG_USB_HIDDEV' is not defined? But then which should it return if it's not even supposed to exist as a test?

And what if instead of 'hid->claimed' it was something like 'is_claimed(hid)'? There might be costs associated with the empty function, and when you put that empty function in the header file, you do not necessarily know about every case that might call it.

This can lead to the ultimate ugliness -- code that has no side-effects around it will call the null function. But code that has side-effects or doesn't want the cost associated with deciding whether or not to call the null function might still need an 'ifdef'. What an unsightly mess that would be.

I see you've done a lot of thinking on this problem, and come up with some logical ideas, but I suspect you've never worked on code with lots of #ifdefs, especially code that needs to be portable to a number of different systems. If there are one or two, it is fine; the problem is that when you get a bunch of them, they are really hard to read. They don't follow normal indentation rules, so it is hard to see where they end, or which '#else' goes with which '#if'. Scary stuff.

Just curious: the scenario described above, moving the #ifdef out of the .c file and into the .h file, mentions making a null function.

My question is: does the null function still result in a function call at run time? Wouldn't this cost a few extra clock ticks that could be spent elsewhere? Or does the compiler optimize it away?
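
For readers following this thread, the pattern being debated can be sketched roughly as below. This is an illustration only: CONFIG_EXAMPLE_FEATURE, struct example_dev, example_feature_event() and example_handle() are made-up names, not the actual hiddev code. When the option is not defined, the stub is an empty static inline function, and an optimizing compiler will normally emit no call for it. That is a property of the compiler and the optimization level, though, not a guarantee of the C language.

```c
/* Hypothetical device structure, for illustration only. */
struct example_dev {
	int events;
};

/* This part would normally live in a header file. */
#ifdef CONFIG_EXAMPLE_FEATURE
static inline void example_feature_event(struct example_dev *dev)
{
	dev->events++;		/* real work when the feature is built in */
}
#else
static inline void example_feature_event(struct example_dev *dev)
{
	(void)dev;		/* empty stub: nothing to do */
}
#endif

/* The .c file calls the function with no #ifdef at the call site. */
static int example_handle(struct example_dev *dev)
{
	example_feature_event(dev);
	return dev->events;
}
```

Compiling with and without -DCONFIG_EXAMPLE_FEATURE and comparing the generated assembly is a simple way to check what a given compiler does with the stub.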

And what if you have, say, 3 lines of code ifdefed out? Not only function calls (those can be ifdefed in a header file) but also assignments? Should those 3 lines be moved to a new function that would be ifdefed away in its header?