
NO! There is a definite and extremely important difference between 'new' and 'fail'. Perceptualdiff only helps with images that really are almost exactly the same. It does NOT help when:
- A Bezier curve is slightly (but benignly) perturbed (which has happened), or some other change occurs that is small and insignificant to a human as far as the correctness of the render is concerned (for example, a change in how bitmaps are resampled).
- A complicated test case was incorrectly judged as 'pass'.
- A 'new' result is actually a 'pass' (for example, when there is no pass reference yet).
- Something changes that is unrelated to the specific test. For example, a few filter tests register as passes because Inkscape does implement the filter in question correctly, yet they still do not render correctly because Inkscape doesn't implement the color-interpolation properties. (It would be better to change those tests, of course, but still, things like this happen.)
In short, perceptualdiff is nowhere near a true substitute for a human judge. It is great for filtering out spurious results caused by minute numerical differences and/or differences in the binary encoding of the PNGs, but that's about it.
For this reason the system was set up specifically to allow for multiple pass/fail references and to flag anything it can't match as a new result. In the past I could easily keep up with judging any new results, because they didn't occur very frequently, but recently a lot of tests suddenly had new results (probably because of changes in bitmap rendering), and since I was/am way too busy I was unable to rejudge them myself. I sent a mail about this at the time, but no one responded.
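To make the distinction concrete, here is a minimal sketch of the kind of classification the harness performs. All names here are hypothetical, and the byte-identical comparison is just a stand-in: the real system would call out to perceptualdiff at this point, so that encoding or tiny numeric differences still count as a match. The key point is the last branch: an unmatched render is flagged 'new' for a human judge rather than being guessed as 'fail'.

```python
import hashlib
from pathlib import Path

def classify(render: Path, pass_refs: list[Path], fail_refs: list[Path]) -> str:
    """Classify a rendered test image against known reference images.

    Stand-in comparison: byte-identical files. A real harness would use
    a perceptual comparison (e.g. perceptualdiff) here instead.
    """
    def digest(p: Path) -> bytes:
        return hashlib.sha256(p.read_bytes()).digest()

    r = digest(render)
    if any(digest(ref) == r for ref in pass_refs):
        return "pass"
    if any(digest(ref) == r for ref in fail_refs):
        return "fail"
    # No reference matches: flag for a human judge instead of
    # automatically calling it a failure.
    return "new"
```

Once a human judges a 'new' result, its image simply gets added to the pass or fail reference list, and identical future renders are matched automatically.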
So: PLEASE just judge these new results once (it probably won't take too long) and then the results will look normal again.
P.S. The system is set up so that if there are two (or more) results in one day it only displays the last; that's why hardly any new results show up in the history of the results. (I'd run the tests, rejudge any new results, if any, and then rerun the tests.)
J.B.C.Engelen@...1578... wrote:
Hi all,
Could someone have a look at the testsuite and reprogram it such that (at least when perceptualdiff is used) the results that are marked 'new' are marked 'fail' instead? This is much clearer, and cleans up the daily checks at http://auriga.mine.nu/inkscape/. Thank you very very much!
I love the testsuite, but I don't want to delve into the code and change things myself.
Thanks a bunch, Johan