Janitorial tasks

From Inkscape Wiki

Latest revision as of 10:05, 6 January 2012

Before embarking on any of these tasks, be sure to contact the developers on the mailing list.

Improving headers for compilation speed

When a header file changes, any file that includes it is recompiled. This becomes a problem when a given header file is included by other header files, as it leads to a vast header include tree, wherein a small change to a seemingly minor header file causes massive rebuilds of much of the codebase.

Fortunately for us, there are recognized techniques to mitigate this! Here's what to do:

  • Find a header that includes several other headers. For example, the header "bar.h" might have a class Bar with a member of type Foo*, thus it includes "foo.h" to get the definition of that type. You can prune this by using a 'forward declaration', whenever the Foo entity is not itself used, only pointers and/or references to it:
//#include "foo.h"  /* <-- kill the header include! */
class Foo;          /* <-- replace with a forward declaration */
class Bar {
    Foo* _foo;
};

Although there are still forward declaration headers in the source tree, those should be avoided in favor of individual forward declarations in .h files as needed. The exception is <2geom/forward.h>, due to heavy use of templates in the 2Geom library.
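
A minimal sketch of the same idea, laid out as it might be split across files. The three files are simulated in one listing, and the members `fooValue()` and `value()` are hypothetical names invented for this illustration. It shows that pointers and references only need the forward declaration, while code that accesses members needs the full definition:

```cpp
// ---- bar.h (sketch) ----
class Foo;                       // forward declaration; no #include "foo.h"

class Bar {
public:
    explicit Bar(Foo &foo);      // a reference parameter works with an incomplete type
    int fooValue() const;        // hypothetical accessor, defined in bar.cpp
private:
    Foo *_foo;                   // a pointer member works with an incomplete type
};

// ---- foo.h (sketch) ----
class Foo {
public:
    explicit Foo(int value) : _value(value) {}
    int value() const { return _value; }
private:
    int _value;
};

// ---- bar.cpp (sketch): this file includes both "bar.h" and "foo.h" ----
Bar::Bar(Foo &foo) : _foo(&foo) {}

int Bar::fooValue() const
{
    return _foo->value();        // member access needs the full definition of Foo
}
```

With this layout, a change to "foo.h" forces a rebuild of "bar.cpp" but not of the files that only include "bar.h".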

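(BPF) The C++ FAQ Sheet (http://www.parashift.com/c++-faq-lite/misc-technical-issues.html#faq-39.11) explains how to code forward declarations for classes that both need to know about each other. To fully understand why, you should study the Pimpl idiom (http://www.gotw.ca/publications/mill04.htm).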
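
For readers unfamiliar with the Pimpl idiom mentioned above, here is a rough sketch of its shape. The class `Widget` and its members are hypothetical and not taken from the Inkscape source, and the use of `std::unique_ptr` assumes a C++11 compiler. The header exposes only a forward-declared implementation class, so private details can change without touching the header:

```cpp
#include <memory>
#include <string>
#include <utility>

// ---- widget.h (sketch) ----
class Widget {
public:
    explicit Widget(std::string name);
    ~Widget();                          // defined in widget.cpp, where Impl is complete
    std::string const &name() const;
private:
    class Impl;                         // forward declaration only
    std::unique_ptr<Impl> _impl;        // pointer to the hidden implementation
};

// ---- widget.cpp (sketch) ----
class Widget::Impl {
public:
    explicit Impl(std::string n) : name(std::move(n)) {}
    std::string name;                   // private data lives here, not in the header
};

Widget::Widget(std::string name) : _impl(new Impl(std::move(name))) {}

Widget::~Widget() {}                    // Impl is complete here, so deletion is well-formed

std::string const &Widget::name() const
{
    return _impl->name;
}
```

The cost is one extra allocation and indirection per object, so this is probably worth considering only for headers that are included widely.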

Source formatting

Header

  • Source files should use four spaces as indentation and no tabs. Trailing whitespace should be removed.
  • The comment at the top of each file should have the following format. The author information is in a regular multiline comment so that it is omitted in the generated documentation. Author emails can be obfuscated, but should be real addresses.
    • Using a Doxygen comment with @file at the top should no longer be the normal case. The comments need to document individual classes, subsystems, etc., and not be focused on file structure. Doxygen itself normally will address file-specific needs.
/*
 * Authors:
 *   J. God Hacker <ihatepizza@gurus.org>
 *   Ellen Epic <epicwin at email dot com>
 *
 * Copyright (C) 2006-2008 Authors
 * Released under GNU GPL, read the file 'COPYING' for more information
 */

Again, note that the comment does not start with "/**", but only with "/*".

@file Command

  • Modern C++ code should avoid global and static variables, functions, enums, etc. Legacy code migrated from C may still contain these, so will require the use of a @file command.
  • Note that the @file command will only be required for files that have non-class non-namespaced globals or statics. As our codebase moves to the more modern C++ practices, use of these will be reduced and removed.
  • Legacy files that contain a mix of functions probably warrant use of a @file command.
  • If feasible, moving statics to anonymous namespaces instead is preferable to adding a @file command.
  • Any documented entities in namespaces, local classes, etc., will be processed even if a @file command is not present in the source file.

An example of a legacy source file using the @file command:

/**
 * @file
 * Logarithmic time traveling salesman solver.
 */
/*
 * Authors:
 *   J. God Hacker <ihatepizza@gurus.org>
 *   Ellen Epic <epicwin at email dot com>
 *
 * Copyright (C) 2006-2008 Authors
 * Released under GNU GPL, read the file 'COPYING' for more information
 */

Note the following:

  • The opening doc comment is merely "/**" on a line by itself. Keeping the rest to subsequent lines aids legibility and revision tracking.
  • The @file command is on a line by itself, with nothing following. This is required to allow Doxygen to automatically extract the current filename.
  • The short description of the file contents (that follows starting on the line after @file) ends with a period. All short (aka "brief") descriptions should end with a period.
  • The end of the doc comment and the start of the normal comment (with authors) are on separate lines. Avoid collapsing them to "*//*".

Footer

  • Every source file should have the following Emacs local variable block and Vim modeline at the end:
/*
  Local Variables:
  mode:c++
  c-file-style:"stroustrup"
  c-file-offsets:((innamespace . 0)(inline-open . 0)(case-label . +))
  indent-tabs-mode:nil
  fill-column:99
  End:
*/
// vim: filetype=cpp:expandtab:shiftwidth=4:tabstop=8:softtabstop=4:fileencoding=utf-8:textwidth=99 :

Include Statement Style

  • Include style. In-tree includes should use quotes, while system headers should use angle brackets. An exception is 2Geom, which should use angle brackets, though it is local (we are preparing for it to become a standalone library one day). Includes in each group should be sorted alphabetically. The path should be relative to the src/ directory. If there is a config.h include, it should go at the top and have an #ifdef guard. Here is an example:
#ifdef HAVE_CONFIG_H
# include "config.h"
#endif

#include <cairo.h>
#include <cstdio>
#include <glib.h>
#include <math.h>

#include "display/cairo-utils.h"
#include "document.h"
#include "sp-use.h"
#include "xml/node.h"

Order of the file

  • Each file should contain the following, in precisely this order:
    • When required for legacy needs, @file comment with a short description of the file's contents
    • Copyright comment with authors
    • Include guard (headers only)
    • System includes
    • Local includes
    • Forward declarations
    • Class declarations
    • Function declarations
    • Global variable declarations (note: global variables should be avoided)
    • End of include guard (headers only)
    • Emacs local variables block
    • Vim modeline
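
The ordering above could look roughly like this for a small header. This is only an illustrative skeleton: the guard name, `ExampleWidget`, `ExampleDocument` and `example_label_length` are all hypothetical, the author block holds placeholders, and the function is defined inline only so that the sketch compiles on its own.

```cpp
/*
 * Authors:
 *   (author names go here)
 *
 * Copyright (C) (years) Authors
 * Released under GNU GPL, read the file 'COPYING' for more information
 */

#ifndef SEEN_EXAMPLE_WIDGET_H   // include guard (hypothetical name)
#define SEEN_EXAMPLE_WIDGET_H

// System includes, sorted alphabetically
#include <cstddef>
#include <string>
#include <utility>

// Local includes would follow here, for example: #include "document.h"

// Forward declarations
class ExampleDocument;

// Class declarations
/**
 * Holds a label and an optional, non-owned document pointer.
 */
class ExampleWidget {
public:
    explicit ExampleWidget(std::string label)
        : _label(std::move(label)), _document(nullptr) {}
    std::string const &label() const { return _label; }
    ExampleDocument *document() const { return _document; }
private:
    std::string _label;
    ExampleDocument *_document;
};

// Function declarations
/**
 * Returns the number of characters in the widget's label.
 */
inline std::size_t example_label_length(ExampleWidget const &widget)
{
    return widget.label().size();
}

#endif // SEEN_EXAMPLE_WIDGET_H

/*
  Local Variables:
  mode:c++
  c-file-style:"stroustrup"
  c-file-offsets:((innamespace . 0)(inline-open . 0)(case-label . +))
  indent-tabs-mode:nil
  fill-column:99
  End:
*/
// vim: filetype=cpp:expandtab:shiftwidth=4:tabstop=8:softtabstop=4:fileencoding=utf-8:textwidth=99 :
```

There is no global variable section in the sketch, in line with the note that global variables should be avoided.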

Documentation

Document At Point of Declaration

  • Items should normally be documented at their point of declaration, not definition.
  • For classes, namespaces, etc. this usually means to document in the .h file and not in the .cpp file.
    • The .h file represents the public API, or promise of functionality.
    • Doxygen comments have not been seen to be updated frequently enough to cause compilation issues from the .h files being touched. (Some developers had expressed concern that having comments in .h files would cause more frequent compilations.)
    • It is easier for developers to read through a .h file for information on a class than going through an entire .cpp or set of .cpp files. Additionally, most modern development IDEs support ease of browsing, referencing, etc.
  • For local functions, declaring them at the beginning of the file they are local to and then implementing them later in the file allows a developer to quickly skim a summary of supported functionality. Pairing the doc comments with the initial declaration as opposed to the later implementation helps legibility.
    • Of course, functions local to a single file should no longer be declared static, but instead be declared inside an anonymous namespace section in the file.
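
As an illustration of the last two points, a .cpp file might be arranged roughly as follows. All function names here are hypothetical. The local helpers are declared and documented at the top inside an anonymous namespace, and implemented further down in the same file:

```cpp
#include <string>

namespace {

/**
 * Returns the text with leading and trailing spaces removed.
 */
std::string trim_spaces(std::string const &text);

/**
 * Returns true if the text is empty once surrounding spaces are removed.
 */
bool is_blank(std::string const &text);

} // namespace

// Public function (hypothetical); its documentation would live in the .h file.
std::string normalized_label(std::string const &raw)
{
    return is_blank(raw) ? std::string("(unnamed)") : trim_spaces(raw);
}

namespace {

std::string trim_spaces(std::string const &text)
{
    std::string::size_type const first = text.find_first_not_of(' ');
    if (first == std::string::npos) {
        return std::string();           // nothing but spaces
    }
    std::string::size_type const last = text.find_last_not_of(' ');
    return text.substr(first, last - first + 1);
}

bool is_blank(std::string const &text)
{
    return trim_spaces(text).empty();
}

} // namespace
```

A reader can skim the first anonymous namespace block to see what helpers the file offers without scrolling through their bodies.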

Make Comments Meaningful

  • Some documentation is useless, for example "constructor" or "destructor". Such comments mark the entity as documented, when in fact it's not. Remove them.

@brief Command

  • The @brief command comes from the more complex documentation format implemented by Trolltech before Doxygen was created. When the @brief command is skipped, Doxygen will use the first sentence (ending with a dot) as the brief description. An alternative is to put the description in a single-line comment. These two techniques can be used to reduce the number of Doxygen commands. In the example below, all three functions will have the same documentation. The first case depends on the variable JAVADOC_AUTOBRIEF being set to true, which is a main setting for Inkscape documentation:
/**
 * Something useful.
 * This function does something very useful.
 * Here is its more detailed, longer description.
 */
void useful_function_two();
/// Something useful
/**
 * This function does something very useful.
 * Here is its more detailed, longer description.
 */
void useful_function_one();
/**
 * @brief Something useful.
 * This function does something very useful.
 * Here is its more detailed, longer description.
 */
void useful_function_three();

The use of @brief in Inkscape code comments is discouraged as redundant and overly verbose.

Coding style

  • Replace C-style casts with the appropriate C++ casts. You can compile with -Wold-style-cast to find them easily.
    • static_cast when the conversion is obvious, for example a floating point to integer type.
    • const_cast if the only difference between the types is const qualifiers.
    • dynamic_cast for downcasting to derived class type. Note that this is not needed to upcast to a parent type.
    • reinterpret_cast if the conversion does not compile with static_cast, for example pointer to integer.

Note that reinterpret_cast<...>(...) should be the cast of last resort.
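
The following sketch shows one use of each cast from the list above. The types `Shape` and `Circle` and all function names are hypothetical and exist only for illustration; `std::uintptr_t` assumes a C++11 standard library.

```cpp
#include <cstdint>

struct Shape {
    virtual ~Shape() {}                 // polymorphic base, required for dynamic_cast
};

struct Circle : Shape {
    double radius;
    explicit Circle(double r) : radius(r) {}
};

// static_cast: an obvious value conversion (truncates toward zero).
int to_whole_pixels(double length)
{
    return static_cast<int>(length);
}

// const_cast: only the const qualifier differs.
// Writing through the result is only valid if the object was not declared const.
char *writable(char const *text)
{
    return const_cast<char *>(text);
}

// dynamic_cast: checked downcast; yields a null pointer if shape is not a Circle.
Circle *as_circle(Shape *shape)
{
    return dynamic_cast<Circle *>(shape);
}

// reinterpret_cast: last resort, here converting a pointer to an integer.
std::uintptr_t address_of(void const *pointer)
{
    return reinterpret_cast<std::uintptr_t>(pointer);
}
```

Upcasting a `Circle *` to a `Shape *` needs no cast at all, which is why it does not appear in the sketch.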

Elimination of old utest tests

It should be double-checked that the old utest tests (also see TestSuite-blueprint) are indeed all obsolete. If any remain that are not obsolete, they should of course be converted to the CxxTest framework (if you don't feel up to it, ask me). Finally, the obsolete files should be removed from the repository and Makefiles, making sure that nothing breaks in the process.