HAProxy coding style for contributions
Introduction

A number of contributors are often embarrassed by coding style issues: they don't always know if they're doing it right, especially since the coding style has evolved over the years. What is explained here is not necessarily what is applied in the code, but new code should conform to this style as much as possible. Coding style fixes happen when code is replaced. It is useless to send patches that only fix coding style; they will be rejected unless they belong to a patch series which needs these fixes before the code changes. Also, please avoid fixing coding style in the same patches as functional changes, as this makes code review harder.

When modifying a file, you must accept the terms of the license of this file, which is recalled at the top of the file, or is explained in the LICENSE file, or if not stated, defaults to LGPL version 2.1 or later for files in the include directory, and GPL version 2 or later for all other files.

When adding a new file, you must add a copyright banner at the top of the file with your real name, e-mail address and a reminder of the license. Contributions under incompatible or overly restrictive licenses might get rejected. If in doubt, please apply the principle above for existing files.

Tabs in this document are represented as a series of 8 spaces so that it displays the same everywhere.

1) Indentation and alignment

1.1) Indentation

Indentation and alignment are two completely different things that people often get wrong. Indentation is used to mark a sub-level in the code. A sub-level means that a block is executed in the context of another block (eg: a function or a condition) :
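As a minimal sketch of sub-levels, using a hypothetical `clamp_to_zero()` function that is not part of HAProxy: the body belongs to the function and the assignment belongs to the `if` statement, so each one is indented one level deeper than its parent.

```c
/* Hypothetical helper used only to illustrate indentation levels. */
int clamp_to_zero(int value)
{
        /* first sub-level: this code belongs to the function */
        if (value < 0)
                value = 0; /* second sub-level: belongs to the "if" */

        return value;
}
```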
In the example above, the code belongs to the

Note that there are places where the code was not properly indented in the past. In order to view it correctly, you may have to set your tab size to 8 characters.

1.2) Alignment

Alignment is used to continue a line in a way that makes things easier to group together. By definition, alignment is character-based, so it uses spaces. Tabs would not work because one tab does not occupy the same number of characters on all displays. For instance, the arguments in a function declaration may be broken into multiple lines using alignment spaces :
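A possible illustration, where the function `copy_bytes()` and its arguments are invented for this sketch: the continuation lines of the declaration are aligned under the first argument with spaces only, while the body is indented with one tab (shown as 8 spaces) per level.

```c
#include <stddef.h>

/* Hypothetical function: the second and third arguments are aligned
 * under the first one using spaces only, never tabs.
 */
size_t copy_bytes(char *dst,
                  const char *src,
                  size_t len)
{
        size_t i;

        for (i = 0; i < len; i++)
                dst[i] = src[i];
        return len;
}
```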
In this example, the "
If we take again the example above marking tabs with "
It is worth noting that some editors tend to confuse indentation and alignment. Emacs is notorious for this brokenness, and is responsible for almost all of the alignment mess. The reason is that Emacs only counts spaces, tries to fill as many as possible with tabs and completes with spaces. Once you know it, you just have to be careful, as alignment is not used much, so generally it is just a matter of replacing the last tab with 8 spaces when this happens.

Indentation should be used everywhere there is a block or an opening brace. Two consecutive closing braces can never be on the same column: that would mean that the innermost block was not indented.

Right :
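A short sketch with a made-up `sum_positive()` function: each closing brace sits one level to the left of the block it encloses, so no two consecutive closing braces share a column.

```c
/* Hypothetical example: sums the strictly positive values of an array
 * and reports how many of them were found in <found>.
 */
int sum_positive(const int *values, int count, int *found)
{
        int i, sum = 0;

        *found = 0;
        for (i = 0; i < count; i++) {
                if (values[i] > 0) {
                        sum += values[i];
                        (*found)++;
                }
        }
        return sum;
}
```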
Wrong :
A special case applies to switch/case statements. Due to my editor's settings,
I've been used to align "
2) Braces

Braces are used to delimit multiple-instruction blocks. In general it is preferred to avoid braces around single-instruction blocks as it reduces the number of lines :

Right :
Wrong :
But it is not that strict, it really depends on the context. It happens from time to time that single-instruction blocks are enclosed within braces because it makes the code more symmetrical, or more readable. Example :
Braces are always needed to declare a function. A function's opening brace must be placed at the beginning of the next line : Right :
Wrong :
Note that a large portion of the code still does not conform to this rule, as it took me years to adapt to this more common standard, which I now tend to prefer because it avoids visual confusion when function declarations are broken on multiple lines :

Right :
Wrong :
Braces should always be used where there might be an ambiguity with the code later. The most common example is the stacked "

Dangerous code waiting for a victim :
Wrong change :
It will do this instead of what your eye seems to tell you :
Right :
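A sketch of the braced form, with hypothetical names: the braces around the outer `if` make it explicit that the `else` belongs to the test on `a` and not to the inner test on `b`.

```c
/* Hypothetical example: returns 1 when both a and b are set, 2 when a
 * is not set, and 0 when only a is set.
 */
int classify(int a, int b)
{
        int ret = 0;

        if (a) {
                if (b)
                        ret = 1;
        }
        else
                ret = 2;

        return ret;
}
```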
Similarly dangerous example :
Wrong change to silence the annoying message :
... which in fact means :
3) Breaking lines

There is no strict rule for line breaking. Some files try to stick to the 80 column limit, but given that various people use various tab sizes, it does not make much sense. Also, code is sometimes easier to read with fewer lines, as it takes less surface on the screen (since each new line adds its tabs and spaces). The rule is to stick to the average line length of other lines. If you are working in a file which fits in 80 columns, try to keep this goal in mind. If you're in a function with 120-char lines, there is no reason to add many short lines, so you can make longer lines.

In general, opening a new block should lead to a new line. Similarly, multiple instructions should be avoided on the same line. But some constructs are more readable when they are perfectly aligned :

A copy-paste bug in the following construct will be easier to spot :
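For illustration only, with a hypothetical `struct box` and `clamp_box()` helper: when similar one-line tests are stacked and aligned with spaces, a field name left unchanged after a copy-paste stands out in its column.

```c
/* Hypothetical structure and helper used to show aligned one-liners. */
struct box {
        int left;
        int right;
        int top;
        int bottom;
};

/* Limits each field of <b> to <max>. <b> must not be NULL. */
void clamp_box(struct box *b, int max)
{
        if (b->left   > max) b->left   = max;
        if (b->right  > max) b->right  = max;
        if (b->top    > max) b->top    = max;
        if (b->bottom > max) b->bottom = max;
}
```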
than in this one :
What is important is not to mix styles. For instance there is nothing wrong
with having many one-line "
Otherwise, prefer to have the " Right :
Wrong :
Right :
or Right :
but Wrong :
When complex conditions or expressions are broken into multiple lines, please ensure that alignment is perfectly appropriate, and group all main operators on the same side (which you're free to choose as long as it does not change for every block). Putting binary operators on the right side is preferred as it does not interfere with alignment, but various people have their preferences.

Right :
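A minimal sketch, assuming a made-up `is_valid_port_range()` helper: all `&&` operators are kept on the right side, and the continuation lines are aligned with spaces under the first operand of the condition.

```c
/* Hypothetical check: returns 1 if [low, high] is a valid port range. */
int is_valid_port_range(int low, int high)
{
        if (low >= 1 && low <= 65535 &&
            high >= 1 && high <= 65535 &&
            low <= high)
                return 1;

        return 0;
}
```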
Right :
Wrong :
If it makes the result more readable, parentheses may even be closed on their own line in order to align with the opening one. Note that this should normally not be needed because such code would be too complex to dig into.

The "

Right :
Right :
Right :
Wrong :
Wrong :
4) Spacing

Correctly spacing code is very important. When you have to spot a bug at 3am, you need it to be clear. When you expect other people to review your code, you want it to be clear and don't want them to get nervous when trying to find what you did.

Always place spaces around all binary or ternary operators, commas, as well as after semi-colons and opening braces if the line continues :

Right :
Wrong :
Never place spaces after unary operators ( Right :
Wrong :
Note that "

Braces opening a block must be preceded by one space unless the brace is placed on the first column :

Right :
Wrong :
Do not add unneeded spaces inside parentheses, as they just make the code less readable.

Right :
Wrong :
Language keywords must all be followed by a space. This is true for control
statements ( Right :
Wrong :
Function calls are different, the opening parenthesis is always coupled to the function name without any space. But spaces are still needed after commas : Right :
Wrong :
5) Excess or lack of parentheses

Sometimes there are too many parentheses in some formulas, sometimes there are too few. There are a few rules of thumb for this. The first one is to respect the compiler's advice. If it emits a warning and asks for more parentheses to avoid confusion, follow the advice, at least to silence the warning. For instance, the code below is quite ambiguous due to its alignment :
Note that this code does :
But maybe the author meant :
A second reason to add parentheses is that people don't always know operator precedence too well. Most often they have no issue with operators of the same category (eg: booleans, integers, bit manipulation, assignment) but once these operators are mixed, it causes them all sorts of issues. In this case, it is wise to use parentheses to avoid errors. One common error concerns the bit shift operators because they're used to replace multiplies and divides but don't have the same precedence :

The expression :
becomes :
which is wrong because it is equivalent to :
while the following was desired instead :
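The precedence trap can be checked with a small sketch. The expression `y * 16 + 5` and the function names are illustrative choices, and the explicit parentheses in `shift_wrong()` only spell out how the compiler parses `y << 4 + 5`.

```c
/* Intended computation: multiply by 16, then add 5. */
unsigned int mul_form(unsigned int y)
{
        return y * 16 + 5;
}

/* What "y << 4 + 5" really means: "+" binds tighter than "<<". */
unsigned int shift_wrong(unsigned int y)
{
        return y << (4 + 5);
}

/* Desired rewrite: parentheses keep the shift separate from the add. */
unsigned int shift_fixed(unsigned int y)
{
        return (y << 4) + 5;
}
```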
It is generally fine to write boolean expressions based on comparisons without any parentheses. But on top of that, integer expressions and assignments should then be protected. For instance, there is an error in the expression below which should be safely rewritten :

Wrong :
Right (may remove a few parentheses depending on taste) :
The " Wrong :
Right :
Parentheses are also found in type casts. Type casting should be avoided as much as possible, especially when it concerns pointer types. Casting a pointer disables the compiler's type checking and is the best way to get caught doing wrong things with data that is not the size you expect. If you need to manipulate multiple data types, you can use a union instead. If the union is really not convenient and casts are easier, then try to isolate them as much as possible, for instance when initializing function arguments or in another function. Not proceeding this way creates a huge risk of using the wrong pointer without any notification, which is especially true during copy-pastes.

Wrong :
Right :
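One way to isolate a pointer conversion, sketched with a hypothetical `struct counters` and callback: the opaque argument is converted exactly once, when the typed local pointer is initialized, and the rest of the function benefits from normal type checking.

```c
/* Hypothetical structure passed through an opaque callback argument. */
struct counters {
        int hits;
        int misses;
};

/* The conversion from "void *" happens only in the initialization of
 * "c"; no cast is repeated in the body. <arg> must point to a valid
 * struct counters.
 */
int total_events(void *arg)
{
        struct counters *c = arg;

        return c->hits + c->misses;
}
```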
6) Ambiguous comparisons with zero or NULL

In C, '

For instance :
is easier to understand than :
For a char this "not" operator can be reminded as "no remaining char", and the
absence of comparison to zero implies existence of the tested entity, hence the
simple
Note the double parenthesis in order to avoid the compiler telling us it looks like an equality test. For a string or more generally any pointer, this test may be understood as an
existence test or a validity test, as the only pointer which will fail to
validate equality is the
However sometimes it can fool the reader. For instance,
strcmp(a, b) == 0 <=> a == b
strcmp(a, b) != 0 <=> a != b
strcmp(a, b) < 0 <=> a < b
strcmp(a, b) > 0 <=> a > b
Avoid this :
Prefer this :
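A sketch of the explicit form, with invented helper names: comparing the result of `strcmp()` against zero keeps the operator in the code identical to the one in the intended string relation.

```c
#include <string.h>

/* Returns 1 if both strings are equal: "strcmp(a, b) == 0" reads like
 * "a == b". Neither pointer may be NULL.
 */
int same_name(const char *a, const char *b)
{
        return strcmp(a, b) == 0;
}

/* Returns 1 if <a> sorts before <b>: "strcmp(a, b) < 0" reads like
 * "a < b".
 */
int sorts_before(const char *a, const char *b)
{
        return strcmp(a, b) < 0;
}
```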
7) System call returns

This is not directly a matter of coding style but more of bad habits. It is important to check for the correct value upon return of syscalls. The proper return code indicating an error is described in its man page. There is no reason to consider wider ranges than what is indicated. For instance, it is common to see such a thing :
This is wrong. The man page says that

8) Declaring new types, names and values

Please refrain from using "
With the types declared in another file this way :
This cannot work because we're comparing a scalar with a struct, which does
not make sense. Without a
Declaring special values may be done using enums. Enums are a way to define structured integer values which are related to each other. They are perfectly suited for state machines. While the first element is always assigned the zero value, not everybody knows that, especially people working with multiple languages all day long. For this reason it is recommended to explicitly force the first value even if it's zero. The last element should be followed by a comma if it is planned that new elements might later be added, as this will make later patches shorter. Conversely, if the last element is placed in order to get the number of possible values, it must not be followed by a comma and must be preceded by a comment :
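A sketch of the second case, using hypothetical connection states: the first value is forced to zero, and the last element only counts the entries, so it has no trailing comma and is preceded by a comment.

```c
#include <stddef.h>

/* Hypothetical connection states for a small state machine. */
enum conn_state {
        CONN_IDLE = 0,
        CONN_CONNECTING,
        CONN_ESTABLISHED,
        CONN_CLOSED,
        /* nothing below this line, CONN_STATES must remain last */
        CONN_STATES
};

/* Returns a printable name for state <st>, or NULL if out of range. */
const char *conn_state_name(int st)
{
        static const char *const names[CONN_STATES] = {
                "idle",
                "connecting",
                "established",
                "closed",
        };

        if (st < 0 || st >= CONN_STATES)
                return NULL;
        return names[st];
}
```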
Structure names should be short enough not to mangle function declarations, and explicit enough to avoid confusion (which is the most important thing). Wrong :
Right :
When declaring new functions or structures, please do not use CamelCase, which is a style where upper and lower case are mixed in a single word. It causes a lot of confusion when words are composed from acronyms, because it's hard to stick to a rule. For instance, a function designed to generate an ISN (initial sequence number) for a TCP/IP connection could be called :
None is right, none is wrong, these are just preferences which might change along the code. Instead, please use an underscore to separate words. Lowercase is preferred for the words, but it is not a problem if acronyms are uppercased. The real advantage of this method is that it creates unambiguous levels even for short names.

Valid examples :
Another example is easy to understand when 3 arguments are involved in naming the function : Wrong (naming conflict) :
Right (unambiguous naming) :
Whenever you manipulate pointers, try to declare them as " Right :
Wrong :
9) Getting macros right

It is very common for macros to do the wrong thing when used in a way their
author did not have in mind. For this reason, macros must always be named with
uppercase letters only. This is the only way to catch the developer's eye when
using them, so that he double-checks whether he's taking risks or not. First,
macros must never ever be terminated by a semi-colon, or they will close the
wrong block once in a while. For instance, the following will cause a build
error before the "
Right :
If multiple instructions are needed, then use a
Second, do not put unprotected control statements in macros, they will definitely cause bugs : Wrong :
Which is equivalent to the undesired form below :
Right way to do it :
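A possible protected form, sketched with a hypothetical `SET_IF()` macro: wrapping the control statement in `do { ... } while (0)` turns it into a single statement that needs its own semi-colon, so it cannot capture an `else` belonging to the caller.

```c
/* Hypothetical macro: assigns <val> to <var> when <cond> is true. The
 * do/while(0) wrapper makes it behave like one ordinary statement, and
 * the definition does not end with a semi-colon.
 */
#define SET_IF(cond, var, val) do { if (cond) (var) = (val); } while (0)

/* The "else" below pairs with "if (a)", not with the "if" hidden in
 * the macro.
 */
int pick(int a, int b)
{
        int ret = 0;

        if (a)
                SET_IF(b, ret, 1);
        else
                ret = 2;

        return ret;
}
```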
Which is equivalent to :
Macro parameters must always be surrounded by parenthesis, and must never be
duplicated in the same macro unless explicitly stated. Also, macros must not be
defined with operators without surrounding parenthesis. The Wrong :
What this will do :
Which is equivalent to :
The first thing to fix is to surround the macro definition with parenthesis to avoid this mistake :
But this is still not enough, as can be seen in this example :
Which is equivalent to :
Which in turn means a totally different thing due to precedence :
This can be fixed by surrounding *each* argument in the macro with parenthesis:
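As a sketch with a hypothetical `MIN2()` macro, both the whole definition and each argument are parenthesized, so neither the surrounding expression nor an operator inside an argument can change the grouping.

```c
/* Hypothetical macro: the whole definition and each argument are
 * wrapped in parentheses. Arguments are still evaluated twice, so
 * callers must not pass expressions with side effects.
 */
#define MIN2(a, b) (((a) < (b)) ? (a) : (b))

int min_masked(int x, int mask, int y)
{
        /* expands to (((x & mask) < (y)) ? (x & mask) : (y)) */
        return MIN2(x & mask, y);
}

int ten_times_min(int x, int y)
{
        /* the outer parentheses keep "10 *" from binding to (x) alone */
        return 10 * MIN2(x, y);
}
```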
But this is still not enough, as can be seen in this example :
Which is equivalent to :
Again, this is wrong because "
At this point, using
10) Includes

Includes are as much as possible listed in alphabetically ordered groups :
Each section is just visually delimited from the other ones using an empty line. The two first ones above may be merged into a single section depending on developer's preference. Please do not copy-paste include statements from other files. Having too many includes significantly increases build time and makes it hard to find which ones are needed later. Just include what you need and if possible in alphabetical order so that when something is missing, it becomes obvious where to look for it and where to add it.
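A minimal sketch of grouped includes. The choice of groups is an assumption of this example, project headers are left out so that the block builds on its own, and `format_name()` is a made-up helper that only exists to use what is included.

```c
/* First group: standard headers, in alphabetical order. */
#include <stdio.h>
#include <string.h>

/* Second group, separated by an empty line: sys/ headers. */
#include <sys/types.h>

/* Writes "[name]" into <dst> and returns the resulting length. <size>
 * must be greater than zero, and <dst> and <name> must not be NULL.
 */
size_t format_name(char *dst, size_t size, const char *name)
{
        snprintf(dst, size, "[%s]", name);
        return strlen(dst);
}
```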
All files should include

Header files are split in two directories ("

All headers which do not depend on anything currently go to the "

Include files must be protected against multiple inclusion using the common
11) Comments

Comments are preferably of the standard 'C' form using "
If multiple code lines need a short comment, try to align them so that you can have multi-line sentences. This is rarely needed, only for really complex constructs.

Do not tell what you're doing in comments, but explain why you're doing it if it does not seem obvious. Also *do* indicate at the top of functions what they accept and what they don't accept. For instance,

Wrong use of comments :
Right use of comments :
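A sketch with a hypothetical `rtrim_len()` function: the header comment states what the function accepts and rejects, and the inner comment explains why the loop stops where it does instead of paraphrasing the code.

```c
#include <stddef.h>

/* Returns the length of <str> without its trailing spaces. <str> must
 * be a valid string, NULL is not accepted. <len> is the length to
 * start from and must not exceed the length of the string.
 */
size_t rtrim_len(const char *str, size_t len)
{
        /* stop at the first non-space so that inner spaces are kept */
        while (len > 0 && str[len - 1] == ' ')
                len--;

        return len;
}
```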
12) Use of assembly

There are many projects where use of assembly code is not welcome. There is no problem with use of assembly in haproxy, provided that :
It is important to take care of various incompatibilities between compiler versions, for instance regarding output and clobbered registers. There is a lot of documentation on the subject on the net. Anyway if you are fiddling with assembly, you probably know that already.

Example :
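A hedged sketch of inline assembly with a hypothetical `swap32()` helper. It assumes a GCC-compatible compiler on x86 for the assembly path and falls back to plain C everywhere else; the fallback is a choice made for this example.

```c
#include <stdint.h>

/* Hypothetical byte-swap helper. The assembly path is only built with
 * GCC-compatible compilers on x86, everything else uses plain C.
 */
uint32_t swap32(uint32_t x)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
        /* "+r": x is both input and output, nothing else is clobbered */
        __asm__("bswap %0" : "+r" (x));
        return x;
#else
        return (x >> 24) | ((x >> 8) & 0x0000ff00U) |
               ((x << 8) & 0x00ff0000U) | (x << 24);
#endif
}
```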