Quality Assurance Initiative

From Inkscape Wiki

This initiative aims to improve the quality and stability of Inkscape and to reduce the number of bugs and regressions, especially severe ones.

In a nutshell, the goal is to have more of:

  • Test coverage of source code (unit tests, component/integration tests, end-to-end tests);
  • Test automation, developer-friendly testing frameworks and tooling;
  • Good practices related to coding and testing;
  • Improvements to the code review process and release management;
  • Focus on quality, especially during the release period.

Conversely, we would like to have less of:

  • Crashes and data loss bugs;
  • Reliance on time-consuming manual testing;
  • Memory leaks and corruption;
  • Glitches, corner cases and difficult-to-reproduce bugs;
  • Code fragility due to hidden assumptions and limitations;
  • Legacy code;
  • Code that nobody understands and maintains.

Proposed Actions (To Do List)

  1. Set up CMake targets and macros to make it easy to create unit tests.
    Currently, all GTest tests link libinkscape_base, so they are in fact integration tests rather than unit tests. There should be an easy way of explicitly listing one or several .cpp files which will be compiled and linked with test support utilities only, so that the functionality is tested in isolation. Additionally, there needs to be a way to request linking in lib2geom for functionality that requires it.
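    One possible shape for such a CMake helper is sketched below. The macro name, the test support target, and the 2Geom::2geom alias are assumptions for illustration, not existing Inkscape build code:

```cmake
# Hypothetical sketch: compile only the listed sources with GTest,
# without linking libinkscape_base, and link lib2geom on request.
macro(add_unit_test test_name)
  cmake_parse_arguments(ARG "USE_2GEOM" "" "SOURCES" ${ARGN})

  # Only the test file and the explicitly listed .cpp files are built.
  add_executable(${test_name} ${test_name}.cpp ${ARG_SOURCES})
  target_link_libraries(${test_name} PRIVATE GTest::gtest_main)

  # Opt-in linking of lib2geom for tests that need geometry support.
  if(ARG_USE_2GEOM)
    target_link_libraries(${test_name} PRIVATE 2Geom::2geom)
  endif()

  gtest_discover_tests(${test_name})
endmacro()

# Hypothetical usage: test a single class in isolation, with lib2geom.
# add_unit_test(test-svg-length SOURCES ../src/svg/svg-length.cpp USE_2GEOM)
```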
  2. Ensure all tests run in debug mode so that failed asserts crash them.
  3. Create a flag to easily run tests with the sanitizer.
    It should be easy to choose whether a given unit or integration test should run with or without libasan. The long-term goal is to transition all tests to run in sanitized mode, but many would currently fail due to memory leaks.
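    Such a per-test opt-in could look roughly like this in CMake (the option and function names are hypothetical):

```cmake
# Hypothetical sketch: a build option plus a helper that adds
# AddressSanitizer flags to an individual test target.
option(INKSCAPE_TESTS_ASAN "Run selected tests with AddressSanitizer" OFF)

function(enable_asan_for_test test_name)
  if(INKSCAPE_TESTS_ASAN)
    target_compile_options(${test_name} PRIVATE
                           -fsanitize=address -fno-omit-frame-pointer)
    target_link_options(${test_name} PRIVATE -fsanitize=address)
  endif()
endfunction()
```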
  4. Create a setup to link inkscape with libasan in the CI pipeline for use in end-to-end tests.
    After building object files, link two inkscape executables: one with and one without libasan. Create a flag to choose which end-to-end test (e.g., CLI test) should run with the sanitizer (aspirationally, all of them, but not long ago even running inkscape --help leaked memory).
  5. Migrate some of the existing tests to unit test framework.
    Do not link all of inkscape just to test a single class or collection of functions.
  6. Devise a test strategy for the interactions between SPObjects and their XML representations.
    For each SPObject type, it should be possible to easily write tests that create such object (in isolation or in a test fixture simulating a minimalistic document) from an XML string representation and, likewise, write an XML representation of the object. Unit tests of this kind would characterize and document the properties of XML representations and, with sufficient coverage, can prevent data loss bugs and many undo-related issues.
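    The round-trip idea can be sketched as follows. The types and functions below are toy stand-ins, not Inkscape's actual SPObject or XML tree API; a real test would build an SPObject from a repr and call its write path:

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Toy stand-in for an SPObject-derived type (a real test would use,
// e.g., SPRect and Inkscape's XML tree instead).
struct Rect {
    double x = 0, y = 0, width = 0, height = 0;
};

// Toy read/write pair standing in for the object's build/write logic.
Rect rect_from_xml(const std::string &xml) {
    Rect r;
    std::sscanf(xml.c_str(),
                "<rect x=\"%lf\" y=\"%lf\" width=\"%lf\" height=\"%lf\"/>",
                &r.x, &r.y, &r.width, &r.height);
    return r;
}

std::string rect_to_xml(const Rect &r) {
    char buf[128];
    std::snprintf(buf, sizeof(buf),
                  "<rect x=\"%g\" y=\"%g\" width=\"%g\" height=\"%g\"/>",
                  r.x, r.y, r.width, r.height);
    return buf;
}

// Characterization test: reading and re-writing must preserve the markup.
bool round_trip_preserves(const std::string &xml) {
    return rect_to_xml(rect_from_xml(xml)) == xml;
}
```

A suite of such round-trip checks, one per attribute and object type, documents exactly which parts of the XML representation survive a read-write cycle.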
  7. Create fixtures for testing SPObject behavior.
    It should be possible to create an SPObject, perhaps embedding it in a minimalistic document provided by a suitable fixture, and perform specific operations on this object in order to create unit tests for all the different SPObjects.
  8. Write integration tests for document-level interactions.
    The goal is to test scenarios where several objects depend on each other: clones, groups, gradients, LPE objects, markers which link to each other by href or another type of reference. Integration tests of this kind (linking the entire libinkscape_base into the test executable) will ensure robust handling of situations such as deletion of dependencies, clipboard operations, undo. When run with the sanitizer, they may catch rare memory corruption bugs.
  9. Refactor the LPE system.
    The LPE system dynamically rewrites the d attribute of an SVG path in order to create an artistic effect. As such, each LPE should be easily testable in principle. Migrate existing end-to-end tests of the LPE system and make them unit tests. Assumption: making LPEs individually testable will improve the quality of LPE-related code and offer an opportunity to redesign it.
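    The underlying assumption is that an LPE can be exposed as a pure function from an input path to an output path. The sketch below uses a toy point list and a made-up "mirror" effect to illustrate the shape of such a test; a real one would call the effect's path-rewriting entry point on a lib2geom path:

```cpp
#include <cassert>
#include <vector>

// Toy "path" and "effect": mirror every point around the y-axis.
// Both are illustrative stand-ins, not Inkscape LPE code.
struct Point { double x, y; };

std::vector<Point> mirror_effect(const std::vector<Point> &path) {
    std::vector<Point> out;
    out.reserve(path.size());
    for (auto p : path) {
        out.push_back({-p.x, p.y});  // the pure, testable transformation
    }
    return out;
}
```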
  10. Enable test coverage reports in the CI pipeline.
    This was already done in lib2geom and the extensions repository, so a similar setup should be easy to reproduce. Consider gamification of test coverage.
  11. Create more rendering tests.
    The display system should be tested more rigorously in order to prevent rendering regressions.
  12. Write automated action-based tests.
    We could write more tests that execute a sequence of actions, either on a fresh document or on a pre-existing one. Although this could be done with end-to-end CLI-based tests, it is probably easier to have a C++ test fixture with actions called manually or via their string names.
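    Such a fixture might dispatch actions by their string names, roughly as below. The Document type and the action names here are invented for illustration and do not correspond to Inkscape's actual action system:

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Toy document: just a list of object names.
struct Document {
    std::vector<std::string> objects;
};

// Sketch of a fixture that runs a scripted sequence of named actions.
class ActionRunner {
public:
    explicit ActionRunner(Document &doc) {
        // Hypothetical actions, registered under string names.
        _actions["duplicate"] = [&doc] {
            if (!doc.objects.empty()) doc.objects.push_back(doc.objects.back());
        };
        _actions["delete-last"] = [&doc] {
            if (!doc.objects.empty()) doc.objects.pop_back();
        };
    }

    // Execute a sequence of actions by name, as a scripted test would.
    void run(const std::vector<std::string> &names) {
        for (auto const &n : names) _actions.at(n)();
    }

private:
    std::map<std::string, std::function<void()>> _actions;
};
```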
  13. Examine the possibility of creating event-injection GUI-tests.
    If successful, this could save a lot of time currently spent on manual GUI testing.
  14. Enhance contribution and code review guidelines with testing-related information.
    A careful balance has to be struck between enforcing sufficient test coverage and remaining open to contributions from relatively inexperienced community members who may lack knowledge of good testing practices. Guidelines that help educate contributors, together with learning from examples (for instance, tests created as part of the items listed above), will ensure that even new contributors can advance the QA goals within Inkscape. We recognize that these goals should not place extra burden on contributors, and that with a sufficient level of CI automation, quality will happen "by default" and "by design".

Automated testing

The Test Pyramid illustrates the relative importance and the desired prevalence of tests at different coverage levels.

Inkscape's CI pipeline contains a number of integration and end-to-end tests. However, we recognise that the overall test coverage is very poor as of late 2023. One of the main goals of this initiative is to track the test coverage and gradually increase it.

Although some existing tests can be classified as "unit tests" because they focus on testing a single unit of code at a time, their external dependencies are poorly understood because all tests link all of libinkscape_base. This should be addressed as part of point 1 above.

In the context of automated tests, we recall the famous Testing Pyramid (see right). It expresses the principle that tests at unit and component level should be the most prevalent ones, followed by a smaller number of integration tests, and topped off with a relatively minor number of end-to-end tests. Unfortunately, this is not the case today: we have too few unit tests. Manual testing should be limited to UX and usability aspects.

Why does Inkscape not have enough unit tests

  • Historical reasons: feature development was prioritized and testing was never formally enforced.
  • Work on tests and QA topics may be perceived as less newsworthy because it is less visible to the end user.
  • Many new developments interact with and depend on legacy code which is hard to split off and mock. Thus, new code is hard to test.
  • Insufficient tooling such as automated test coverage reports or custom test fixtures to facilitate test development.
  • Varied knowledge level of contributors, some lacking expertise in testing.

What can be done about it

  • Code review should emphasize the importance of adding unit tests for new functionality and regression tests for bug fixes.
  • Quality and maintainability should be viewed as core aspects and as an important part of Inkscape's development.
  • Legacy code must be managed carefully, see below.
  • Test coverage reports should be introduced along with facilities that make it easy and pleasant to write tests.
  • By increasing the number of tests, we can increase the knowledge and awareness of testing techniques in the community – learning from examples.

Dealing with legacy code

According to Michael Feathers, the author of the influential book "Working Effectively with Legacy Code", legacy code is not defined as "old code", "code written in C" or, in general, "code that nobody maintains". Instead, legacy code is code that is neither tested nor testable. We recognise that Inkscape contains a lot of such code.

The Vicious Cycle of Legacy Code

The fundamental problem with legacy code is that it's very risky to change it because a small change in one place can have unpredictable and unintended consequences in other places. This has happened many times in Inkscape's history and has led to severe bugs. In an ideal world, the existing code could be first enriched with "characterization tests" that capture its current behaviour, and tests could also be added for the new behaviour. In this way, we could assure that the development works as intended and doesn't break anything. But, according to our definition, legacy code is not testable so this cannot be done. We don't live in an ideal world!

Thus, we are stuck in a vicious cycle: to make code testable, we need to refactor it, but refactoring can break things badly unless we have tests, which we don't have and which cannot be added because the code is not testable.

To break out of this cycle, we need to start small, adding tests to simple and relatively independent pieces of functionality. It is helpful to identify the "seams", which usually correspond to interfaces, along which a program or system can be naturally divided into functional pieces. If these pieces are simple enough, they can be made testable. However, even with this approach, a fair bit of mocking may be required to simulate external dependencies, which makes this a tedious task.

Refactoring and development techniques to increase testability

A large portion of Inkscape's codebase consists of code that is very literal, low-level and lacking in abstraction. Oftentimes, such code was developed as part of small contributions without architectural goals, focused only on adding a small missing piece of functionality. As contributors come and go, this leaves behind a patchwork of fairly inconsistent solutions. Nonetheless, many parts of Inkscape have seen notable refactoring efforts aimed at unifying these patchwork solutions. It is important, however, that future refactoring efforts take testability into account.

Designing good interfaces and abstractions is an important goal that should be kept in mind not only during major architectural updates, but also in our everyday work. Whenever you work on Inkscape, ask yourself: "Does this change make the code I'm working on more or less testable?" Below are some concrete examples of techniques that can help increase testability.

Extract and mock/override

  • Identify the piece of code that makes the functionality non-testable.
  • Move this piece of code into one or more separate functions or a class. Moving code is safer than rewriting it, because the functional logic is kept unchanged.
  • In tests, mock the class or function that the problematic code was moved to. This lets you test new functionality independently of the legacy code.
  • If you need to extend the functionality provided by the non-testable code, try doing it without touching the untested function, but rather wrap around it or provide an override.
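A minimal sketch of the extract-and-override variant, with invented class and function names (the "non-testable" dependency here is imagined to be a file dialog):

```cpp
#include <cassert>
#include <string>

// The problematic dependency is extracted into a virtual member
// function, which a test subclass can override.
class ExportJob {
public:
    virtual ~ExportJob() = default;

    // The logic we actually want to unit-test.
    std::string statusMessage(bool ok) const {
        return ok ? "Exported to " + outputPath() : "Export failed";
    }

protected:
    // Extracted seam: imagine this queried a GUI file dialog.
    virtual std::string outputPath() const { return "/real/untestable/path.svg"; }
};

// Test double: overrides the extracted seam with a canned value.
class FakeExportJob : public ExportJob {
protected:
    std::string outputPath() const override { return "out.svg"; }
};
```

The tests instantiate FakeExportJob and exercise statusMessage() without ever touching the real, non-testable code path.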

Decorator pattern

This design pattern helps extend a legacy class with new functionality. Instead of adding the new functionality directly to the non-testable class (which would make the new functionality non-testable as well), write a wrapper around the legacy class. In the decorator pattern, the decorator class holds an instance of the wrapped class and provides an interface that forwards member function calls to the wrapped object. However, some of the member functions of the wrapper can provide additional functionality or further work on the values obtained from the wrapped object, which makes it possible to add new behaviours. This new functionality can now be tested by mocking the wrapped class in tests.
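A minimal decorator sketch; the class names are illustrative only, with LegacyRect standing in for the untestable legacy class:

```cpp
#include <cassert>
#include <memory>
#include <string>

// Common interface shared by the legacy class and the decorator.
class Shape {
public:
    virtual ~Shape() = default;
    virtual std::string describe() const = 0;
};

// Stand-in for the non-testable legacy class.
class LegacyRect : public Shape {
public:
    std::string describe() const override { return "rect"; }
};

// Decorator: holds the wrapped object, forwards the call, and layers
// new (testable) behaviour on top of the forwarded result.
class LabelledShape : public Shape {
public:
    LabelledShape(std::unique_ptr<Shape> inner, std::string label)
        : _inner(std::move(inner)), _label(std::move(label)) {}

    std::string describe() const override {
        return _label + ": " + _inner->describe();
    }

private:
    std::unique_ptr<Shape> _inner;
    std::string _label;
};
```

In a unit test, the wrapped Shape can be replaced by a mock, so the labelling behaviour is tested without instantiating LegacyRect at all.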

Façade transformation

This technique helps when dealing with huge classes that hold too much data and that have too many member functions, making them complex, difficult to test, and increasing the coupling with other parts of the code. The façade transformation replaces the large class with a façade class which internally holds instances of several smaller classes. These smaller classes need to be designed by identifying the functionally cohesive pieces of the original large class and splitting them up. The member functions of the façade delegate to member function calls on these internal instances, sometimes more than one at a time. This transformation can be repeated if needed, until the classes are small enough that testing them is possible. The advantage of the façade transformation is that the API exposed by the large class is unchanged.
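A condensed sketch of the transformation, using hypothetical names (imagine Canvas was once a huge class; its public API stays the same while the state is split into two small, independently testable classes):

```cpp
#include <cassert>

// First cohesive piece split out of the formerly huge class.
class ZoomState {
public:
    void zoomIn() { _level *= 2.0; }
    double level() const { return _level; }
private:
    double _level = 1.0;
};

// Second cohesive piece.
class ScrollState {
public:
    void scrollTo(double x, double y) { _x = x; _y = y; }
    double x() const { return _x; }
private:
    double _x = 0, _y = 0;
};

// The façade: callers see the unchanged API, but each member function
// merely delegates to the small internal instances (sometimes to both).
class Canvas {
public:
    void zoomIn() { _zoom.zoomIn(); }
    void scrollTo(double x, double y) { _scroll.scrollTo(x, y); }
    void reset() { _zoom = ZoomState{}; _scroll = ScrollState{}; }  // delegates to both
    double zoomLevel() const { return _zoom.level(); }
    double scrollX() const { return _scroll.x(); }
private:
    ZoomState _zoom;
    ScrollState _scroll;
};
```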

Guidelines on writing unit tests

  • Create a separate test executable for each class or functional module.
  • Write a separate test case for each of the tested behaviours.
  • Write a doxygen-style comment above each test case, explaining very briefly what this case is testing.
  • In each test case, follow the Arrange-Act-Assert principle:
    • Arrange: create and prepare the data and objects needed to perform the tested operation; set up expectations for mock calls;
    • Act: run the tested operation;
    • Assert: use the EXPECT_* and ASSERT_* macros from the GoogleTest library to check that the behaviour is as expected.
  • Remember to test strange and corner cases, invalid input, fallback behaviours.
  • If your test needs custom set-up and tear-down, use a custom test fixture and TEST_F macros.
  • If your test needs to run on several cases parametrized by input data, consider using a TEST_P fixture.
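The Arrange-Act-Assert structure can be sketched as below. To keep the example self-contained it uses a toy function and plain assert(); a real Inkscape test would use GTest's TEST() macro with EXPECT_*/ASSERT_* instead:

```cpp
#include <cassert>

// Toy unit under test: clamp a value into the range [lo, hi].
double clamp_value(double v, double lo, double hi) {
    return v < lo ? lo : (v > hi ? hi : v);
}

/// Values below the range are clamped up to the lower bound.
bool test_clamp_below_range() {
    // Arrange: choose an input below the allowed range.
    double input = -3.0;
    // Act: run the tested operation.
    double result = clamp_value(input, 0.0, 1.0);
    // Assert: the behaviour is as expected.
    return result == 0.0;
}
```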

Guidelines on writing end-to-end tests

If you are about to create an end-to-end test, first ask yourself whether a unit test could be written instead. However, in the case of regression tests, it is often good to have end-to-end tests that clearly demonstrate that a bug has been fixed and ensure it is not accidentally reintroduced. End-to-end tests are also suitable for complex cases involving import and export operations. In many cases (such as display and rendering, utilities, Live Path Effects, the SPObject system, or anything in lib2geom), unit tests ought to be preferred.

  • If you create test files (SVG documents or files in other formats), ensure that the content of the files can be visually inspected.
  • If your test file contains both test objects and reference objects, ensure that the relationship between them can be directly inspected visually by a human being in order to confirm that the tested behaviour is indeed correct. You may use existing fixtures such as Inkscape::TestWithSvgObjectPairs for this purpose.
  • If you use separate reference files, ensure they can be visually inspected as well.
  • Try to make your test focused on the tested behaviour by reducing the number of unrelated elements in your test files.
  • Try to make your test robust and independent of implementation details unrelated to the tested behaviour.