Using VTK's Image Regression Tests in Avogadro 2

One of the really nice features of VTK's testing framework is the use of image-based regression tests. These allow developers to write tests that result in a final image, which can be recorded and compared to known baseline images in order to verify that the OpenGL rendering code is rendering the same (or similar) image on all platforms. If the comparison fails, CDash will display the image the test produced, the baseline image it was compared to, and an image difference. Any project that performs rendering or visualization needs tests like these in addition to unit tests if it wants to ensure that visualization code continues to function as expected across a range of platforms.

We recently extracted the relevant code from the VTK testing framework to perform image-based regression tests in Avogadro 2, with the bulk of that code living in utilities/vtktesting/imageregressiontest.h. It is currently used in one of the tests, with plans to extend it to cover all major types of rendering. It can be seen in action in tests/qtopengl/glwidgettest.cpp, where the important lines that take the snapshot and do the image comparison are:

  // Grab the frame buffer of the GLWidget and save it to a QImage.
  QImage image = widget.grabFrameBuffer(false);
  // Set up the image regression test.
  Avogadro::VtkTesting::ImageRegressionTest test(argc, argv);
  // Do the image threshold test, printing output to std::cout.
  return test.imageThresholdTest(image, std::cout);
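
The test object above is constructed from argc and argv, which carry the --baseline and --temporary paths supplied by the test's CMake code. As a rough illustration of how such arguments might be picked apart, here is a minimal sketch. The RegressionArgs struct and parseRegressionArgs function are hypothetical names made up for this example; they are not the actual Avogadro or VTK implementation.

```cpp
#include <cassert>
#include <string>

// Hypothetical holder for the two directories passed on the command line.
struct RegressionArgs {
  std::string baselineDir;
  std::string temporaryDir;
};

// Scan argv for "--baseline <dir>" and "--temporary <dir>". Anything else
// (such as the test name consumed by the test driver) is ignored.
RegressionArgs parseRegressionArgs(int argc, const char* const argv[])
{
  assert(argc == 0 || argv != nullptr);
  RegressionArgs args;
  for (int i = 1; i + 1 < argc; ++i) {
    const std::string flag(argv[i]);
    if (flag == "--baseline")
      args.baselineDir = argv[++i];
    else if (flag == "--temporary")
      args.temporaryDir = argv[++i];
  }
  return args;
}
```

Ignoring unknown arguments keeps the sketch compatible with a generated test driver that passes the test name as the first argument.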

The CMake code that feeds in the command line arguments and ensures the test runs correctly is in tests/qtopengl/CMakeLists.txt. It largely involves passing in paths to the baseline directory and a temporary directory, along with the test name (using the standard CMake generated test driver).

  add_test(NAME "QtOpenGL-${test}"
    COMMAND
      AvogadroQtOpenGLTests "${testname}test"
      "--baseline" "${AVOGADRO_DATA_ROOT}/baselines/avogadro/qtopengl"
      "--temporary" "${PROJECT_BINARY_DIR}/Testing/Temporary")

[Image: Valid baseline image]

The image above is the baseline, which is stored in a known location and compared with the image produced by the test (shown below).

[Image: Test image produced]

If the images don't match, a difference image is produced and uploaded (shown below). In this case an extra sphere was rendered, which shows up clearly in the difference image. The test also returns a numerical difference, which is a measure of how much the images differ. The tolerance can be tweaked per test to allow some minor pixel differences, although care must be taken not to raise it so high that real regressions slip through.

[Image: Image difference from test to valid]
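
To make the idea of a numerical difference and a tolerance concrete, here is a minimal sketch. It assumes a simple mean absolute per-channel difference, which is not the metric VTK computes, and the RgbImage struct and both function names are hypothetical, introduced only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Hypothetical minimal image: 8-bit RGB, row-major, 3 bytes per pixel.
struct RgbImage {
  int width = 0;
  int height = 0;
  std::vector<std::uint8_t> pixels; // expected size: width * height * 3
};

// Mean absolute per-channel difference, in the range [0, 255].
// This is an assumed stand-in metric, not VTK's image difference.
double imageDifference(const RgbImage& test, const RgbImage& baseline)
{
  // The images being compared must be the same size.
  assert(test.width == baseline.width && test.height == baseline.height);
  assert(test.pixels.size() == baseline.pixels.size());
  if (test.pixels.empty())
    return 0.0;
  double sum = 0.0;
  for (std::size_t i = 0; i < test.pixels.size(); ++i) {
    sum += std::abs(static_cast<int>(test.pixels[i]) -
                    static_cast<int>(baseline.pixels[i]));
  }
  return sum / static_cast<double>(test.pixels.size());
}

// The test passes when the measured difference does not exceed the tolerance.
bool withinTolerance(double difference, double tolerance)
{
  return difference <= tolerance;
}
```

With a metric like this, a small tolerance absorbs minor pixel noise between drivers, while an extra rendered sphere changes enough pixels to push the difference well past it.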

We have not implemented it in Avogadro 2 yet, but VTK can use multiple baselines and returns the smallest image difference. This allows OS/GPU-specific baselines to be uploaded where necessary as an alternative to increasing the tolerance. Special tags printed by the tests to standard output prompt the ctest command to upload the image files when necessary (when the baseline image cannot be found, or the image comparison fails).
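
A rough sketch of both ideas follows: picking the closest of several baselines, and printing measurement tags for ctest to pick up. The BaselineResult struct and the function names are hypothetical. The tag names (ImageError, TestImage, DifferenceImage, ValidImage) follow the convention as we understand VTK's tests to use it, so treat them as an assumption and check them against the VTK testing code before relying on them.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record: one candidate baseline and the difference measured
// between it and the test image.
struct BaselineResult {
  std::string path;
  double difference;
};

// Return the index of the baseline with the smallest difference,
// or -1 if no baselines were found.
int bestBaseline(const std::vector<BaselineResult>& results)
{
  int best = -1;
  for (std::size_t i = 0; i < results.size(); ++i) {
    if (best < 0 || results[i].difference < results[best].difference)
      best = static_cast<int>(i);
  }
  return best;
}

// Format one file measurement tag (tag and attribute names assumed).
std::string fileTag(const std::string& name, const std::string& path)
{
  return "<DartMeasurementFile name=\"" + name +
         "\" type=\"image/png\">" + path + "</DartMeasurementFile>\n";
}

// Build the text a failing test would print to standard output.
std::string failureTags(double error, const std::string& testImage,
                        const std::string& diffImage,
                        const std::string& validImage)
{
  std::ostringstream out;
  out << "<DartMeasurement name=\"ImageError\" type=\"numeric/double\">"
      << error << "</DartMeasurement>\n";
  out << fileTag("TestImage", testImage)
      << fileTag("DifferenceImage", diffImage)
      << fileTag("ValidImage", validImage);
  return out.str();
}
```

Choosing the minimum over all baselines means a platform-specific baseline only needs to be added where the default one fails, leaving the tolerance untouched.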

Comments

dhaumann on :

Is this the way to go for graphical unit testing? I'll soon need something similar:
1. create an image from tex (baseline 1)
2. create an image from my gui (baseline 2)
3. create current image from gui (what changed?)
and then compare these.

Are there other image unit testing frameworks?

Marcus D. Hanwell on :

There is reftest, which Benoit mentioned in the comment below; I was not aware of it until now. The code I pointed to is some pretty simple C++ code you could use, but it would add a VTK dependency to your testing framework. One thing we must do is ensure the images being compared are all the same size, but you could certainly compare many images (just look at the image comparison pipeline I set up in the VTK testing header linked to).

Benoit Jacob on :

FWIW, what we do at Mozilla for rendering regression tests, called 'reftest', allows us not just to compare rendering to a reference image, but to verify that any two different renderings are identical. For example, you may want to verify that the rendering of a molecule does not depend on the ordering of the atoms. Not having to store reference renderings is a plus, because it means that you can subsequently make changes that affect all rendering without having to re-generate all your reference images. It also avoids false positives caused by details of GL rendering on a particular system.

Marcus D. Hanwell on :

Comparing different renderings might be useful, although that rarely comes up in VTK. It would be pretty easy to adapt this approach to compare one test image with that generated by another test, and I can see how that is very useful in browsers. It would be good to take a look at reftest. The measure of difference and fuzziness (tolerance) helps us with details of GL rendering on different systems (coupled with multiple baselines). Most of the time the focus is on establishing good baselines, which must also be true in Mozilla projects (I imagine at least).
