On 2013-10-16 07:29, Tavmjong Bah wrote:
On Wed, 2013-10-16 at 02:48 +0200, Guiu Rocafort wrote:
Hi! I've been making some progress on the automated testing code. I've simplified it a lot and tried to keep it simple. ( http://bazaar.launchpad.net/~neandertalspeople/+junk/inkscape-testsuite/file... )
Very nice! You may have simplified a little too much, though, although it does depend a bit on what the goal is. In particular, you seem to have removed any support for patch files (which are insanely useful for testing certain things in isolation, and for avoiding problems due to different renderings on different systems, for example when using fonts). I would also recommend quoting command line arguments, or you risk running into trouble the first time someone puts a space in a filename, for example. The rest of the simplifications you can probably get away with, although I would indeed test what happens when Inkscape crashes (that was one of the reasons for tester.cpp, but it is primarily a problem under Windows, so if Linux is the only target, then it might indeed just work).
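For what it's worth, here is a rough sketch of what I mean, assuming the harness is written in Python (the function name and the 60-second timeout are placeholders I made up): passing the command line as a list keeps the shell out of the picture entirely, so spaces in filenames are harmless, and on Linux the return code gives you crash detection for free.

    import subprocess

    def render_svg(inkscape_bin, svg_path, png_path):
        # Passing the arguments as a list (no shell involved) means
        # filenames containing spaces need no quoting at all.
        # --export-png is the PNG export option of the current CLI.
        cmd = [inkscape_bin, "--export-png=" + png_path, svg_path]
        try:
            ret = subprocess.call(cmd, timeout=60)  # guard against hangs
        except subprocess.TimeoutExpired:
            return False
        # On Linux a negative return code means Inkscape was killed by
        # a signal, i.e. it crashed.
        return ret == 0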
Trunk uses a new renderer based on Cairo. It is not surprising that the images from 0.48 don't match trunk on a pixel-by-pixel basis. (BTW, the SVG spec says that there is a one-pixel tolerance when rendering SVGs.) Since automated testing is mostly for checking for regressions, I would simply make the reference images using trunk (comparing them with the PNGs from W3C to determine pass/fail).
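If the references are generated by trunk itself, the comparison can stay dead simple. A minimal sketch using PIL, assuming a Python harness (the function name is made up):

    from PIL import Image, ImageChops

    def matches_reference(reference_png, output_png):
        # Strict pixel-by-pixel comparison; this is only reasonable
        # when the reference was rendered by the same renderer (i.e.
        # trunk), not by Batik or another implementation.
        ref = Image.open(reference_png).convert("RGBA")
        out = Image.open(output_png).convert("RGBA")
        if ref.size != out.size:
            return False
        # getbbox() returns None exactly when the difference image is
        # all zero, i.e. the images are identical.
        return ImageChops.difference(ref, out).getbbox() is None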
I fully agree: make life easy on yourself. You may even consider using just one reference (as you appear to be doing now); then, for now, you don't have to sort out whether tests pass or fail (assuming the automated test system can live with that). In that case I would not use the names "fail" and "pass" though, as they are a bit misleading. Maybe "same" and "changed" or something?
So I've started to notice that this is going to be more difficult than I initially thought. I might try using perceptualdiff to determine whether the changes are small enough to consider the test passed. A more developed idea would be to measure the "density of changes", in pixels, in areas of the image.
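Roughly what I have in mind, as a quick PIL sketch (the 16-pixel tile size and the 10% threshold below are arbitrary numbers, just for illustration):

    from PIL import Image, ImageChops

    def max_change_density(reference_png, output_png, tile=16):
        # Alpha is ignored for simplicity; both images are assumed to
        # have the same dimensions.
        ref = Image.open(reference_png).convert("RGB")
        out = Image.open(output_png).convert("RGB")
        diff = ImageChops.difference(ref, out).convert("L")
        w, h = diff.size
        worst = 0.0
        for y in range(0, h, tile):
            for x in range(0, w, tile):
                block = diff.crop((x, y, min(x + tile, w), min(y + tile, h)))
                changed = sum(1 for p in block.getdata() if p > 0)
                total = block.size[0] * block.size[1]
                worst = max(worst, changed / float(total))
        return worst

    # A few scattered antialiasing differences give a low score, while
    # a dense clump of changed pixels in one tile gives a high one, e.g.:
    # passed = max_change_density("ref.png", "out.png") < 0.10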
Any thoughts? Tavmjong? Jasper?
The W3C reference images are (if I remember correctly, and this may have changed with the new test suite) essentially unsuitable for automated testing with Inkscape. There are simply too many small differences between Inkscape and what they used (Batik, I think) to make a direct comparison. The one-pixel-off rule might help (if we implemented something to use it), but Inkscape is systematically half a pixel off compared to Batik: Inkscape considers a pixel to be at (x+0.5, y+0.5) while Batik considers it to be at (x, y), if I'm not mistaken (just compare the black border around the tests to see what I mean). Also, suddenly rendering a whole object one pixel off would typically be considered a bug, so anything that uses the "one pixel off" rule would probably have to be a bit more intelligent than just allowing things to be one pixel off.
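To illustrate, the naive version of the rule would look something like the PIL sketch below, and it happily accepts an image in which an entire object has shifted by one pixel:

    from PIL import Image

    def within_one_pixel(reference_png, output_png):
        # Accept a pixel if it matches any reference pixel in its 3x3
        # neighbourhood.  Note this also accepts a whole object shifted
        # by one pixel, which we would normally consider a bug, so a
        # real implementation would need to be smarter than this.
        ref = Image.open(reference_png).convert("RGB")
        out = Image.open(output_png).convert("RGB")
        if ref.size != out.size:
            return False
        w, h = ref.size
        rp, op = ref.load(), out.load()
        for y in range(h):
            for x in range(w):
                if not any(op[x, y] == rp[nx, ny]
                           for nx in range(max(0, x - 1), min(w, x + 2))
                           for ny in range(max(0, y - 1), min(h, y + 2))):
                    return False
        return True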
Long story short: just use the current output from Inkscape as a reference, and (optionally) divide the tests into pass and fail categories. (It would be much nicer to have the division, as it would allow us to keep track of where we are in terms of compliance, but don't let it get in the way of having usable automated testing.)