Coding rules
We have developed some coding rules (or conventions) for the project, mainly to make it easier for us to understand each other's code. We ask you to adhere to these rules, even the ones you don't like very much, so that the whole Wonka codebase has a consistent feel to it.
Linus Torvalds has this to say about the coding conventions of Linux:
This is a short document describing the preferred coding style for the linux kernel. Coding style is very personal, and I won't force my views on anybody, but this is what goes for anything that I have to be able to maintain, and I'd prefer it for most other things too. Please at least consider the points made here.
First off, I'd suggest printing out a copy of the GNU coding standards, and NOT read it. Burn them, it's a great symbolic gesture.
Linus Torvalds, in linux/Documentation/CodingStyle
... and we agree. Read what follows in that frame of mind.
Fundamental Data Types
The C language explicitly leaves many details of implementation undefined, allowing the designer of a C compiler considerable freedom. In particular, the implementation of standard types such as int is left almost completely open. For this reason, in hal/cpu/<cpu>/include/processor.h, where <cpu> is the CPU type, we define a set of types which are used throughout Wonka:
Type name | Width etc. | Corresponding Java type | Notes |
---|---|---|---|
w_void | none (C void) | -- | |
w_boolean | any integer type | boolean | Packed bitwise in Java boolean array |
w_ubyte | 8 bit unsigned integer | -- | |
w_sbyte | 8 bit signed integer | byte | |
w_short | 16 bit signed integer | short | |
w_ushort | 16 bit unsigned integer | -- | |
w_char | 16 bit unsigned integer | char | |
w_int | 32 bit signed integer | int | |
w_word | 32 bit unsigned integer | -- | Used for Java fields, stack items, etc. |
w_size | 32 bit unsigned integer | -- | Used for sizes of objects etc. |
w_flags | 32 bit unsigned integer | -- | Used for words holding up to 32 1-bit flags. |
w_long | 64 bit signed integer | long | |
w_ulong | 64 bit unsigned integer | -- | |
w_float | 32 bit IEEE 754 floating-point number | float | |
w_double | 64 bit IEEE 754 floating-point number | double | |
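As a rough sketch of what such a set of definitions could look like on a platform that provides the C99 header stdint.h, the following maps a few of the names from the table onto fixed-width types and checks their widths at compile time. The mapping is an assumption made for illustration; it is not the contents of any actual processor.h.

```c
#include <assert.h>   /* static_assert (C11) */
#include <limits.h>   /* CHAR_BIT */
#include <stdint.h>   /* fixed-width integer types */

/* Assumed mapping, for illustration only. */
typedef uint8_t  w_ubyte;
typedef int8_t   w_sbyte;
typedef int16_t  w_short;
typedef uint16_t w_char;
typedef int32_t  w_int;
typedef uint32_t w_word;
typedef int64_t  w_long;
typedef float    w_float;
typedef double   w_double;

/* Fail the build if a type does not have the width promised in the table. */
static_assert(sizeof(w_char) * CHAR_BIT == 16, "w_char must be 16 bits");
static_assert(sizeof(w_int) * CHAR_BIT == 32, "w_int must be 32 bits");
static_assert(sizeof(w_word) * CHAR_BIT == 32, "w_word must be 32 bits");
static_assert(sizeof(w_long) * CHAR_BIT == 64, "w_long must be 64 bits");
static_assert(sizeof(w_float) * CHAR_BIT == 32, "w_float must be 32 bits");
static_assert(sizeof(w_double) * CHAR_BIT == 64, "w_double must be 64 bits");
```

On a CPU where these assumptions do not hold, the build stops at the failing assertion, which is exactly the point of keeping the definitions in one per-CPU header.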
Comments and whitespace
Block comments are written like so:
/*
** My hovercraft is full of eels.
*/
Shorter, "marginal" comments can be written like so:
elementCount -= removeCount; /* Don't count the elements we will be removing */
There is no set upper limit on the width of a comment block, or on the width of a marginal comment, but please bear in mind that really long lines are really hard to read.
Any block comment that begins with /** (note the extra asterisk) will be automagically extracted into a section of the Reference Manual. Within such a comment block the following conventions apply:
- Use LaTeX markup.
- Each block should be a \subsection. (In the Reference Manual, the directory will be a \chapter and the file will be a \section).
- Use \texttt{...} for names of C fields, variables, functions, etc. Use \textsf{...} to refer to Java classes and types. Use \textit{...} for emphasis.
- Note that any leading whitespace and the two asterisks at the start of each line of the block will be removed by the extracting program.
These documentation comments chiefly occur in header files, but they may also appear in program files if the implementation details deserve to be documented in the Reference Manual.
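To make these conventions concrete, here is a sketch of a documentation comment placed in front of a small helper. The function clampToMax and the stand-in w_int typedef are invented for this example; only the layout of the comment follows the rules above.

```c
#include <stdint.h>

typedef int32_t w_int;   /* stand-in for the Wonka type, assumed for this example */

/** \subsection{clampToMax}
** Returns \texttt{result} if it does not exceed \texttt{max}, and
** \texttt{max} otherwise. A caller filling a \textsf{java.lang.String}
** buffer could use it to make sure a length \textit{never} exceeds
** the capacity.
*/
static inline w_int clampToMax(w_int result, w_int max) {
  if (result > max) {
    result = max;
  }
  return result;
}
```

After extraction, the leading asterisks disappear and the text becomes an ordinary \subsection in the LaTeX source of the Reference Manual.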
Comments and whitespace are like currants and air bubbles in a cake: you can have too few, and you can have too many. There is no universally accepted rule to determine what constitutes "too few" or "too many". Remember that these things are supposed to make the code easier to understand: taken to excess, they can have the opposite effect.
Header (.h) Files
Each header file should begin as follows:
#ifndef _ARRAY_H
#define _ARRAY_H

/**************************************************************************
* Copyright (c) 2001 by Acunia N.V. All rights reserved.                  *
*                                                                         *
* This software is copyrighted by and is the sole property of Acunia N.V. *
* and its licensors, if any. All rights, title, ownership, or other       *
* interests in the software remain the property of Acunia N.V. and its    *
* licensors, if any.                                                      *
*                                                                         *
* This software may only be used in accordance with the corresponding     *
* license agreement. Any unauthorized use, duplication, transmission,     *
* distribution or disclosure of this software is expressly forbidden.     *
*                                                                         *
* This Copyright notice may not be removed or modified without prior      *
* written consent of Acunia N.V.                                          *
*                                                                         *
* Acunia N.V. reserves the right to modify this software without notice.  *
*                                                                         *
* Acunia N.V.                                                             *
* Vanden Tymplestraat 35      info@acunia.com                             *
* 3000 Leuven                 http://www.acunia.com                       *
* Belgium - EUROPE                                                        *
**************************************************************************/

/*
** $Id: coding-rules.html,v 1.2 2001/11/27 15:53:21 gray Exp $
*/
and end with:
#endif /* _ARRAY_H */
The #ifndef ... #define ... #endif directives ensure that nothing terrible happens if this header file gets included twice (hard to avoid in a large project). In the comment block which follows, the $Id: ... keyword will automatically be updated by CVS to show the current revision and the date it was checked in. The rest of the comment block (which should of course be adapted to the particular case) reminds anyone reading this file that the code belongs to someone, and tells her where to look for the legal details (see the licensing page).
A header file should contain only macros and declarations (function prototypes and externs), not definitions. The only exception is the use of static inline functions as a kind of "typesafe macro".
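The difference between an ordinary macro and a static inline "typesafe macro" can be sketched as follows. The guard name _MINMAX_H and both helpers are hypothetical, w_int is a stand-in typedef, and the copyright block is left out for brevity.

```c
#ifndef _MINMAX_H
#define _MINMAX_H

#include <stdint.h>

typedef int32_t w_int;   /* stand-in for the Wonka type, assumed for this example */

/*
** Macro version: accepts arguments of any type, and evaluates the
** larger argument twice.
*/
#define MAX_OF(a, b) (((a) > (b)) ? (a) : (b))

/*
** Static inline version: the compiler checks the argument types, and
** each argument is evaluated exactly once.
*/
static inline w_int maxOf(w_int a, w_int b) {
  if (a > b) {
    b = a;
  }
  return b;
}

#endif /* _MINMAX_H */
```

Calling MAX_OF(i++, -1) increments i twice, whereas maxOf(i++, -1) increments it once; that kind of surprise is what the static inline form avoids.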
Structure and type declarations
The following illustrates a typical Wonka structure declaration:
typedef struct w_Bubble *w_bubble;

/** \subsection{w\_Bubble structure declaration}

  This is an example of a fictitious structure declaration in Wonka.
  Fields \texttt{next} and \texttt{previous} are links to the next
  and previous items in a doubly-linked list. Field \texttt{flags}
  holds some kind of funky bitmap, and \texttt{numElements} holds
  the number of elements in the variable-length array \texttt{elements}.

*/

typedef struct w_Bubble {
  w_bubble   next;         /* This is an informative comment */
  w_bubble   previous;
  w_flags    flags;
  w_int      numElements;  /* Try to align comments whenever possible */
  w_element *elements;
} w_Bubble;
This simple example already illustrates several points.
- We define a struct with the tag w_Bubble, and in the same breath we typedef this to w_Bubble.
- We define a type w_bubble which points to a structure of type w_Bubble. Apart from the fundamental types listed above, you can be pretty sure that a w_foo is a pointer to struct w_Foo. (In this case w_bubble is defined ahead of w_Bubble, because the latter contains pointers to other w_Bubbles.)
- We always separate the declaration of a type from the definition of a variable of that type. (Even if there is only one variable of that type).
- We don't go for complicated field names such as bubble004_int_numElements. In C each struct type defines its own namespace, so if bubba is a w_bubble, bubba->flags is completely unambiguous, even if there are many struct types having a field named flags. In fact we try to always use the same name for the same thing:
  - next, previous: for a pointer to another struct of the same type, in the context of a linked list;
  - flags: for a word full of 1-bit flags;
  - name: for a w_string holding the Java-accessible name of something.
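As an illustration of how these naming conventions read in practice, here is a sketch of a helper that links a new item into a doubly-linked list of bubbles. The function insertBubbleAfter is invented for this example, the w_flags and w_int typedefs are stand-ins, and the elements field is left out because w_element is not defined here.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t w_flags;   /* stand-ins for the Wonka types, assumed here */
typedef int32_t  w_int;

typedef struct w_Bubble *w_bubble;

typedef struct w_Bubble {
  w_bubble next;
  w_bubble previous;
  w_flags  flags;
  w_int    numElements;
} w_Bubble;

/*
** Link newBubble into the list immediately after bubba.
*/
static void insertBubbleAfter(w_bubble bubba, w_bubble newBubble) {
  newBubble->previous = bubba;
  newBubble->next = bubba->next;
  if (bubba->next != NULL) {
    bubba->next->previous = newBubble;
  }
  bubba->next = newBubble;
}
```

Because every list-like struct uses next and previous, a reader who has seen one such helper can read all the others without looking up the field names.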
Function prototypes
Forward declarations in a header file should always be full prototypes, e.g.
void *findFrobnicator(void);
not just
findFrobnicator();
The declaration should be preceded by a block comment which describes what the function does and what the parameters and the value returned represent.
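The reason for insisting on (void) is that, before C23, a declaration with empty parentheses says nothing about the parameters, so the compiler cannot check calls against it. The sketch below shows a full prototype with its descriptive comment; the body of findFrobnicator and the variable theFrobnicator are invented so that the example is complete.

```c
#include <stddef.h>

/*
** Returns a pointer to the frobnicator, or NULL if there is none.
** Takes no parameters.
*/
void *findFrobnicator(void);

static int theFrobnicator = 42;   /* hypothetical object, for illustration */

/*
** There is only one frobnicator in this sketch, so just return its address.
*/
void *findFrobnicator(void) {
  return &theFrobnicator;
}
```

With the prototype in scope, a mistaken call such as findFrobnicator(1) is rejected at compile time instead of silently passing an argument nobody reads.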
Macros, extern declarations
These should also be preceded by a brief block comment describing their use. In general, a programmer who needs to use the functions and data structures you have defined should be able to find everything she needs in the header file: the .c file contains only implementation details.
Program (.c) files
Relationship with header files
Generally there should be a one-to-one relationship between program and header files, i.e. the prototypes and interface documentation for all the functions which are exported by foo.c should be found in foo.h. However, exceptions can occur.
Variable definitions
Every variable which is used in Wonka needs to be defined exactly once. Generally this will happen in the .c file corresponding to the .h file in which it was declared.
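The split between declaration and definition can be sketched in a single listing. The file names bubbles.h and bubbles.c, the variable bubbleCount and the function countBubble are all hypothetical, and w_int is a stand-in typedef.

```c
#include <stdint.h>

typedef int32_t w_int;   /* stand-in for the Wonka type, assumed for this example */

/* --- would live in bubbles.h: a declaration, no storage is reserved --- */
extern w_int bubbleCount;

/* --- would live in bubbles.c: the one and only definition --- */
w_int bubbleCount = 0;

/*
** Any file which includes bubbles.h can use the variable.
*/
void countBubble(void) {
  bubbleCount += 1;
}
```

If a second .c file also defined bubbleCount, the linker would typically report a duplicate symbol, which is why the definition belongs in exactly one place.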
Function definitions
Each function definition should be preceded by a short descriptive block comment. The emphasis here should be on how the function is implemented, since the interface is defined in the header file. If the function is complex, block or marginal comments may also appear in the function body.
The woempa macro can be defined either to give selective debugging information at runtime (DEBUG=yes) or to do nothing at all (DEBUG=no). Use of the woempa macro can replace explicit comments in the code: e.g.
if ((clazz->thread == thread)
    && isSet(clazz->flags, CLAZZ_BEING_PROCESSED)
    && (clazz->verify == alreadyVerified)) {
  woempa(1, "Thread %w is already busy preparing class %w\n", NM(thread), NM(clazz));

  return NULL;

}
This example also illustrates a number of other points:
- In complex conditionals, make use of (technically redundant) parentheses to make the binding of operators clear. This will also help prevent you from making mistakes when C's operator precedence rules are, um, surprising.
- Put whitespace before and after the = of an assignment, and before and after arithmetic, comparison, and logical operators. (But not before and after ( ) [ ] . ->. Put a space after a comma, but not before.)
- The opening { goes on the same line as the if, do, while etc.; the closing } goes vertically below the start of that statement. The statements between the brackets are indented by two spaces each time.
- A return statement other than at the very end of a function should always be offset with a blank line above and below, so it cannot be overlooked. The same applies for a break statement within a loop.
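For readers wondering how a macro can either print debugging information or vanish completely, here is a minimal sketch of the general technique. It is not the definition of woempa: the names sketchWoempa, DEBUG_SKETCH and TRIGGER_LEVEL are invented, the meaning given to the first argument (a level compared against a threshold) is an assumption, and plain printf formats are used instead of Wonka's %w.

```c
#include <stdio.h>

#define TRIGGER_LEVEL 5   /* invented threshold: lower levels are not printed */

#ifdef DEBUG_SKETCH
/* Debugging on: print the source location, then the formatted message. */
#define sketchWoempa(level, ...) \
  do { \
    if ((level) >= TRIGGER_LEVEL) { \
      fprintf(stderr, "%s:%d: ", __FILE__, __LINE__); \
      fprintf(stderr, __VA_ARGS__); \
    } \
  } while (0)
#else
/* Debugging off: the call is never executed, but its arguments are still type-checked. */
#define sketchWoempa(level, ...) \
  do { \
    if (0) { \
      fprintf(stderr, __VA_ARGS__); \
    } \
  } while (0)
#endif

/*
** Returns 1 if the class may be prepared, 0 if it is already busy.
*/
int mayPrepare(int busy, const char *className) {
  if (busy) {
    sketchWoempa(9, "Class %s is already being prepared\n", className);

    return 0;

  }
  return 1;
}
```

The do { ... } while (0) wrapper makes the macro behave like a single statement, so it can be used safely inside an if with curly brackets or without.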
Some other rules:
- Always use curly brackets in an if, do, or while statement, even when they are not strictly necessary.
if (result > max) result = max;   /* WRONG */

if (result > max) {
  result = max;                   /* RIGHT */
}
If you need an else, do it like so:
if (a > b) {
  max = a;
}
else {
  max = b;
}
- Don't use tabs. Ever. If your editor converts spaces to tabs, find out how to stop it doing that, or trash it.
- Don't try to do too much on one line. Don't e.g. hide assignments in the middle of statements. Give each variable its own declaration.
w_int i,*ip;                                 /* YUK */
result = (new = getNext(old)) ? new : old;   /* UGH */
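One way the two lines marked YUK and UGH could be rewritten to follow the rules is sketched below. The w_Node type, getNext, nextOrSame and countNodes are hypothetical and exist only to give the statements something to operate on; w_int is a stand-in typedef.

```c
#include <stddef.h>
#include <stdint.h>

typedef int32_t w_int;   /* stand-in for the Wonka type, assumed for this example */

typedef struct w_Node *w_node;

typedef struct w_Node {
  w_node next;
} w_Node;

static w_node getNext(w_node old) {
  return old->next;
}

/*
** Returns the successor of old, or old itself if it has no successor.
*/
static w_node nextOrSame(w_node old) {
  w_node candidate;
  w_node result;

  candidate = getNext(old);   /* the assignment gets a line of its own */
  result = old;
  if (candidate != NULL) {
    result = candidate;
  }
  return result;
}

/*
** Counts the nodes from first to the end of the list.
*/
static w_int countNodes(w_node first) {
  w_int  count;               /* one declaration per variable */
  w_node current;

  count = 0;
  current = first;
  while (current != NULL) {
    count += 1;
    current = getNext(current);
  }
  return count;
}
```

The rewritten code is longer, but every assignment and every declaration can be found by scanning the left margin.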