Difference between revisions of "Coding Style"
(11 intermediate revisions by the same user not shown) | |||
Line 89: | Line 89: | ||
:However, the block is intimately related to the if/for/while in ways that a normal block is not. | :However, the block is intimately related to the if/for/while in ways that a normal block is not. | ||
=== Functions vs methods, i.e. member vs non-member functions === | |||
* Information-hiding principle suggests that if a function doesn't need access to private variables then it's better not to make it a member function. | |||
** In more detail: when one wants to change some detain about a private variable, then one must look through everything that has access to that variable. This is easier if there's less code that has access to the variable, e.g. if one uses non-member functions in preference to member-functions. | |||
* Virtual functions must of course be member functions. | |||
* Trivial getter/setter functions seem most natural as member functions. (This is mostly a matter of style or tradition rather than technical reasons.) | |||
** "barType getBar() const { return _bar; }" / "void setBar(barType x) { _bar = x; }" are trivial. It's not clear how much else is to be considered trivial. | |||
* For things needing access to private variables, the requirement of the `friend' declaration for non-member functions is a cost: extra clutter in the class definition, requires updating for changes in the signature; and many of the arguments for non-member functions disappear. | |||
* Member functions must be declared in the class definition, which tends to require more recompilation because one can't use a separate header file. | |||
** Sometimes one can break the class into smaller, simpler classes if there are lots of member functions. | |||
* A related issue is that it's harder (impossible) to write private specializations (same name but subclass arguments): any specialization must be declared in the class definition too, so it isn't private other than through comments. | |||
* The shorter names typical of member functions are harder to search for. | |||
** (Counter-argument: this is a matter of custom, there's no technical reason why one can't use long names for member functions, or even short names for non-member functions.) | |||
** Someone has claimed that some tools do relatively well at finding the right version of a given name. Can someone give examples of such a tool (and some comment as to how widely used that tool is or can be by our developers) ? | |||
* Pointers to member functions aren't as easily used as pointers to non-member functions. ("unwieldy", "heavy-weight".) | |||
* Member functions can't be given static linkage. | |||
=== What to put in what header file === | |||
Reasons for change: | |||
* Reduce compilation times (most important) | |||
* Fix the mess of NR::Point, NR::Matrix etc. header files. Less confusion as to what needs to be #included, or where to look for the definition of something (especially in the cases where tags-like programs don't help, e.g. if looking for something helpful for a known task rather than having a known name). | |||
What won't change: | |||
* foo.h will still provide the same things it always has (unless you count things "accidentally" provided, which we hope to become less). It's just that some content may be physically moved to a file that foo.h #includes. | |||
The issues are: | |||
* Ease for callers (what file/s to #include). This is largely taken care of by having files that #include other header files, and in particular having the foo.h file as at present. | |||
* "I need to change the blah declaration/definition; what file do I edit?". | |||
* Information we want ppl to be aware of. | |||
Definitions are handled by tags/idutils/global etc (though all of these tools have limitations in their ability to find the _right_ definition). | |||
Declarations that are separate from the definition: [non-member] function declarations are the main case. member functions (obviously kept in same physical file as their class definition); non-member functions | |||
Here are some types of things to be found in header files. Whatever rules you hypothesize, check that the rules cover each of these cases. | |||
* class foo definition | |||
* "class foo;" forward declarations | |||
* struct foo_class (i.e. GObject "class"/vtable type) | |||
* IS_FOO, FOO cast etc. macros | |||
* ancilliary types (e.g. struct StopOnTrue for marshalling) | |||
* enums | |||
* relevant instantiations (traits etc.) | |||
* foo_do_blah (declaration of function defined in .cpp file) | |||
* inline function definitions | |||
* should declaration & definition of inline functions go in different files (there are pros & cons) | |||
* other constants, including #defines | |||
* other macros. Where is the line between "other macro" and "inline function" and IS_FOO macros? | |||
The physical placement of the IS_FOO etc. macros (as distinct from what header files "provide" them) isn't important: we never need to edit them, and they're relatively easy to find by either primitive tags-like programs or by grep. | |||
* This unimportance-of-physical-location is reflected in the inconsistency of current practice, sometimes placing them with foo.h and sometimes in blah-forward.h. | |||
* Exception to unimportance-of-physical-location: IS_FOO etc. should not be physically in the same header file as something that's "expensive" and unneeded by one or more translation units that need IS_FOO etc. ("Expensive" in same sense as `amount' defined below in `Reducing compile times'.) | |||
The main thing for IS_FOO etc. is what header file(s) provide the macro. Thankfully, this is easy in the common case: usually any function that needs the definition of IS_FOO also needs lots of other things relevant to the foo type, so would simply #include foo.h. | |||
Relevant properties of function declarations: | |||
* It is common to add or remove functions, or change the arguments. | |||
* Typically not found by tags-like programs. E.g. neither `tags' nor `TAGS' files store the location of declarations (other than the definition). | |||
* Programmers want to know what functions (including methods, macros, inline things) are available (or rather "is there something that does blah"). | |||
** Probably doxygen or the like is the best task for this, though we'd need to make it convenient to use. | |||
* Needed in advance of function definitions (whether in .cpp file or inline function definitions). Not needed in advance by macro definitions per se (but needed in advance of function definitions where the macro is expanded). | |||
Inline function definitions: | |||
* Relatively expensive in compile time for translation units that don't use that inline function definition. | |||
* Must come after (in the translation unit) all referenced declarations. Sometimes this requires that it be outside of the class definition or in a separate file from related declarations. (E.g. if foo.h and bar.h each provide an inline function definition that requires declarations provided by the other file.) | |||
Operator overloading: | |||
* Can be unclear whether to associate with the type of the left operand or the type of the right operand. | |||
* We can simplify things for users of the function by having both left.h and right.h provide it (i.e. have both left.h and right.h #include it). This suggests that it should be physically neither in left.h nor right.h (unless it's OK for left.h to pull in all of right.h or vice versa). | |||
Reducing compile times | |||
* For each translation unit U, reduce the amount of material that gets included by U and isn't needed by U. | |||
** "Amount" is measured by cost: how often that material changes (causing a recompile), and how expensive it is to compile (e.g. inline function definitions). | |||
* In what circumstances does a translation unit include things that it doesn't need? | |||
** Inline function definitions. | |||
** Needs a function declaration but doesn't need the type definition. | |||
*** A special [non-] case of this is needing a member function declaration. Member function declarations can't be separated from their class definition. See also `methods vs functions' above. | |||
** Suppose that struct Foo has a `Bar _bar' member. The translation unit may need to create Foo objects (and thus need the type definition of Bar to know its size), but not need to access _bar (so not needing anything from bar.h other than the type definition of Bar). | |||
*** This is very common. Common examples of Bar: SPShape, NR::Point/NR::Matrix. | |||
** Needs some function declarations but not others. | |||
** Needs some type forward declarations but not others. | |||
** Comments are never needed for compilation. | |||
** Definition of a marshaller is needed only by emitters. However, it's difficult to separate the marshaller from the signal, and difficult to separate the signal from the definition of the containing type (SPDesktop, SPItem). | |||
** Including the definition of a type when its declaration would suffice. | |||
** Copy&paste of #include lines. | |||
* Ways of reducing the need for type definitions: | |||
** Store just a pointer instead of the whole type. | |||
** Provide a wrapper function (not inlined) for `new Foo'. |
Revision as of 07:34, 26 October 2004
Inkscape Coding Style Discussion
The <a href="/doc/coding_style.php">official code style documentation</a> is on the main website.
This page is for discussing and working out changes and improvements to that document. When concensus is reached, the document can be updated in CVS int he inkscape_web module.
Tabs and Alignment
- No tabs in files.
- Tabs are evil.
- By default many editors (and pagers as used for viewing diffs, and "diff -t") set tab stops to 8.
- If files end up with mixed space & tab alignment (other than in a constrained way such as suggested below by mental), then it breaks formatting to use anything but 8 as one's editor's tab stop size.
- A minor disadvantage of using spaces for indentation is that some text editors (not emacsen or vim) would require
- using the space key a lot. On the other hand, code written in such an editor probably requires reformatting anyway.
- Observing a rule of using tabs for structural indentation, and spaces for cosmetic (e.g. continuation) indentation, works
- fine regardless of a particular user's tab stops. -- MenTaLguY
- Somewhat contrived example (as seen with an 8-character tab stop):
if (gromzalon) { imagine_that_this_is_very_long(some, arguments, go, here); }
- The breakdown:
<=Tab====>if (gromzalon) { <=Tab====><=Tab====>imagine_that_this_is_very_long(some, arguments, <=Tab====><=Tab====><=Spaces======================>go, here); <=Tab====>}
- What this looks like with a tab stop of 3:
if (gromzalon) { imagine_that_this_is_very_long(some, arguments, go, here); }
- -- MenTaLguY
- (By the way, anyone submitting code with identifiers like "gromzalon" or "sprunklefrink" will be summarily shot.)
- That spacing will often be undone [by editors themeslves, if not users]-- [JonCruz Jon C.]
<=Spaces======================>go, here
will get mangled by most editors. Often like this:
<=Tab=Tab=Tab=Tab=Spaces=>go, here
emacs21 mode
The following should get emacs users started. Put the following in your ~/.emacs file and load it:
(setq c-basic-offset 4 c-offsets-alist '((innamespace . -2)(substatement-open 0)) c-brace-offset 0)
Some additional emacs21 notes:
- .h files load C mode, not C++ mode. Hit M-x and type c++-mode and it loads c++-mode.
- C-M-\ indents a buffer and converts all tabs to whitespace.
- Can anyone give a specification for common editors or astyle/indent to give this behaviour, or suggest a script to check where this hasn't been done?
- Without such automation, this approach won't be used by many developers.
- Suggesting that people change their tab stop size is only practical if all source code uses spaces & tabs the same way.
- This may be possible with mode lines if all Inkscape developers stick to editors with appropriate modeline support (emacsen, vim/gvim; anything else?).
- Do any popular Windows/MacOS editors (other than the above) support mode lines?
- What about KDevelop/anjuta/other IDE's on Un*x?
- It's fairly easy to define this behavior for Emacs and VIM. Having it explicit might be enough.
- It's possible to add checkin scripts to CVS to warn/prevent checkins with tabs, etc.
- Something else to consider is developer culture -- _most_ of our developers at this point will likely be coming from GNU, GNOME or Linux backgrounds, where using 8-column tabs for indention is the norm (and indeed part of the coding standards).
- Objective studies have found that indentions between 2 and 6 columns are optimal.
- Reference: [ProgramIndentation Miara, J. R., J. A. Musselman, J. A. Navarro, and B. Shneiderman. Program
Indentation and Comprehensibility. Comm. ACM 26, 11 (Nov. 1983), 861-867.]
Brace Placement
Arguments against "disco"/compact braces:
- Make it hard to match braces visually.
- Partial counter-argument: emacsen & vim each have brace-locating commands. (Emacsen: C-M-u, C-M-n, C-M-p, C-M-d (up/next/prev/down); vim: `[ {', `] }'.)
- The brace closes a block, not an if "statement".
- However, the block is intimately related to the if/for/while in ways that a normal block is not.
Functions vs methods, i.e. member vs non-member functions
- Information-hiding principle suggests that if a function doesn't need access to private variables then it's better not to make it a member function.
- In more detail: when one wants to change some detain about a private variable, then one must look through everything that has access to that variable. This is easier if there's less code that has access to the variable, e.g. if one uses non-member functions in preference to member-functions.
- Virtual functions must of course be member functions.
- Trivial getter/setter functions seem most natural as member functions. (This is mostly a matter of style or tradition rather than technical reasons.)
- "barType getBar() const { return _bar; }" / "void setBar(barType x) { _bar = x; }" are trivial. It's not clear how much else is to be considered trivial.
- For things needing access to private variables, the requirement of the `friend' declaration for non-member functions is a cost: extra clutter in the class definition, requires updating for changes in the signature; and many of the arguments for non-member functions disappear.
- Member functions must be declared in the class definition, which tends to require more recompilation because one can't use a separate header file.
- Sometimes one can break the class into smaller, simpler classes if there are lots of member functions.
- A related issue is that it's harder (impossible) to write private specializations (same name but subclass arguments): any specialization must be declared in the class definition too, so it isn't private other than through comments.
- The shorter names typical of member functions are harder to search for.
- (Counter-argument: this is a matter of custom, there's no technical reason why one can't use long names for member functions, or even short names for non-member functions.)
- Someone has claimed that some tools do relatively well at finding the right version of a given name. Can someone give examples of such a tool (and some comment as to how widely used that tool is or can be by our developers) ?
- Pointers to member functions aren't as easily used as pointers to non-member functions. ("unwieldy", "heavy-weight".)
- Member functions can't be given static linkage.
What to put in what header file
Reasons for change:
- Reduce compilation times (most important)
- Fix the mess of NR::Point, NR::Matrix etc. header files. Less confusion as to what needs to be #included, or where to look for the definition of something (especially in the cases where tags-like programs don't help, e.g. if looking for something helpful for a known task rather than having a known name).
What won't change:
- foo.h will still provide the same things it always has (unless you count things "accidentally" provided, which we hope to become less). It's just that some content may be physically moved to a file that foo.h #includes.
The issues are:
- Ease for callers (what file/s to #include). This is largely taken care of by having files that #include other header files, and in particular having the foo.h file as at present.
- "I need to change the blah declaration/definition; what file do I edit?".
- Information we want ppl to be aware of.
Definitions are handled by tags/idutils/global etc (though all of these tools have limitations in their ability to find the _right_ definition). Declarations that are separate from the definition: [non-member] function declarations are the main case. member functions (obviously kept in same physical file as their class definition); non-member functions
Here are some types of things to be found in header files. Whatever rules you hypothesize, check that the rules cover each of these cases.
- class foo definition
- "class foo;" forward declarations
- struct foo_class (i.e. GObject "class"/vtable type)
- IS_FOO, FOO cast etc. macros
- ancilliary types (e.g. struct StopOnTrue for marshalling)
- enums
- relevant instantiations (traits etc.)
- foo_do_blah (declaration of function defined in .cpp file)
- inline function definitions
- should declaration & definition of inline functions go in different files (there are pros & cons)
- other constants, including #defines
- other macros. Where is the line between "other macro" and "inline function" and IS_FOO macros?
The physical placement of the IS_FOO etc. macros (as distinct from what header files "provide" them) isn't important: we never need to edit them, and they're relatively easy to find by either primitive tags-like programs or by grep.
- This unimportance-of-physical-location is reflected in the inconsistency of current practice, sometimes placing them with foo.h and sometimes in blah-forward.h.
- Exception to unimportance-of-physical-location: IS_FOO etc. should not be physically in the same header file as something that's "expensive" and unneeded by one or more translation units that need IS_FOO etc. ("Expensive" in same sense as `amount' defined below in `Reducing compile times'.)
The main thing for IS_FOO etc. is what header file(s) provide the macro. Thankfully, this is easy in the common case: usually any function that needs the definition of IS_FOO also needs lots of other things relevant to the foo type, so would simply #include foo.h.
Relevant properties of function declarations:
- It is common to add or remove functions, or change the arguments.
- Typically not found by tags-like programs. E.g. neither `tags' nor `TAGS' files store the location of declarations (other than the definition).
- Programmers want to know what functions (including methods, macros, inline things) are available (or rather "is there something that does blah").
- Probably doxygen or the like is the best task for this, though we'd need to make it convenient to use.
- Needed in advance of function definitions (whether in .cpp file or inline function definitions). Not needed in advance by macro definitions per se (but needed in advance of function definitions where the macro is expanded).
Inline function definitions:
- Relatively expensive in compile time for translation units that don't use that inline function definition.
- Must come after (in the translation unit) all referenced declarations. Sometimes this requires that it be outside of the class definition or in a separate file from related declarations. (E.g. if foo.h and bar.h each provide an inline function definition that requires declarations provided by the other file.)
Operator overloading:
- Can be unclear whether to associate with the type of the left operand or the type of the right operand.
- We can simplify things for users of the function by having both left.h and right.h provide it (i.e. have both left.h and right.h #include it). This suggests that it should be physically neither in left.h nor right.h (unless it's OK for left.h to pull in all of right.h or vice versa).
Reducing compile times
- For each translation unit U, reduce the amount of material that gets included by U and isn't needed by U.
- "Amount" is measured by cost: how often that material changes (causing a recompile), and how expensive it is to compile (e.g. inline function definitions).
- In what circumstances does a translation unit include things that it doesn't need?
- Inline function definitions.
- Needs a function declaration but doesn't need the type definition.
- A special [non-] case of this is needing a member function declaration. Member function declarations can't be separated from their class definition. See also `methods vs functions' above.
- Suppose that struct Foo has a `Bar _bar' member. The translation unit may need to create Foo objects (and thus need the type definition of Bar to know its size), but not need to access _bar (so not needing anything from bar.h other than the type definition of Bar).
- This is very common. Common examples of Bar: SPShape, NR::Point/NR::Matrix.
- Needs some function declarations but not others.
- Needs some type forward declarations but not others.
- Comments are never needed for compilation.
- Definition of a marshaller is needed only by emitters. However, it's difficult to separate the marshaller from the signal, and difficult to separate the signal from the definition of the containing type (SPDesktop, SPItem).
- Including the definition of a type when its declaration would suffice.
- Copy&paste of #include lines.
- Ways of reducing the need for type definitions:
- Store just a pointer instead of the whole type.
- Provide a wrapper function (not inlined) for `new Foo'.