Re: [Inkscape-devel] Imported PNG file not displayed.

25 Sep 2009

On 09/24/2009 08:52 AM, Jasper van de Gronde wrote:
...
Diederik van Lierop wrote:
...
... For now I've fixed my patch (see the attachement) by limiting it 
to the non-MMX c++ code. This fixes the bitmap pixel alignment issue 
for some users, but not for everyone. There must be someone reading 
this who can port this to the assembly code too. It really is only a 
few lines of code! Until then this bug cannot be completely solved, 
.... unless we start using Cairo (?)
I would recommend timing to see whether the MMX code is (still) really 
much faster than the normal C++ code. If it's not significantly faster 
than the old 32bit C++ code I would go with the higher quality code.
As for porting the code to assembly, if it is indeed making use of MMX 
I have doubts about the feasibility/usefulness of porting the code, as 
the assembly would likely not be as efficient for 64bit integers as it 
would be for 32bit integers (assuming the right instructions do 
exist). But of course this would have to be tested.
Yes, valid points indeed! So I prepared a patch to do some measurements 
and got some interesting results: when running Fedora 10 i386 
virtualized on a Fedora 10 x86_64 host, the non-mmx code actually 
appears to be faster by 10 - 15%! But before we decide to kick out some 
of the mmx optimalizations (only the ones that are being tested here), 
we would need more measurements because my setup might not be 
representative. So to all the bug reporters who had their images 
disappearing due to my previous patch: I know you're running with the 
mmx optimized code ;-). Please test the attached patch against rev. 
22287 and report back the measurement data. This is what you should get 
on the console:
...
avg. time spent rendering; with mmx -> 4.10622 msec | without mmx -> 
3.5991 msec
avg. time spent rendering; with mmx -> 4.10563 msec | without mmx -> 
3.59859 msec
avg. time spent rendering; with mmx -> 4.10459 msec | without mmx -> 
3.59769 msec
Only the last line is relevant.
PS: Don't use this patch for production work, because everything will be 
rendered twice (with and without mmx optimalisations). This will slow 
down the rendering. I also tried running the non-mmx optimized code 
before the other to eliminate any caching effects, but the results look 
similar
If we can get rid of this specific mmx optimized assembly code, then we 
can solve the misalignment in rendering the bitmaps for everyone!
Kind regards,
Diederik
Index: src/libnr/nr-compose-transform.cpp
===================================================================

--- src/libnr/nr-compose-transform.cpp	(revision 22287)
+++ src/libnr/nr-compose-transform.cpp	(working copy)
@@ -345,14 +345,37 @@
    	}
#ifdef WITH_MMX
-		if (NR_PIXOPS_MMX) {
-			/* WARNING: MMX composer REQUIRES w > 0 and h > 0 */
-			nr_mmx_R8G8B8A8_P_R8G8B8A8_P_R8G8B8A8_N_TRANSFORM_n (px, w, h, rs, spx, sw, sh, srs, FFd2s, FF_S, alpha, dbits);
-			return;
-		}
+	GTimeVal t1, t2, t3;
+       	static double total_mmx = 0;
+       	static double total_no_mmx = 0;
+	static double n = 0;    
+
+        if (NR_PIXOPS_MMX) {            
+            g_get_current_time(&t1);
+
+            /* WARNING: MMX composer REQUIRES w > 0 and h > 0 */
+            nr_mmx_R8G8B8A8_P_R8G8B8A8_P_R8G8B8A8_N_TRANSFORM_n (px, w, h, rs, spx, sw, sh, srs, FFd2s, FF_S, alpha, dbits);
+
+            g_get_current_time(&t2);
+        }
 #endif
-		nr_R8G8B8A8_P_R8G8B8A8_P_R8G8B8A8_N_TRANSFORM_n (px, w, h, rs, spx, sw, sh, srs, FFd2s, FF_S, alpha, dbits);
-	}
+        nr_R8G8B8A8_P_R8G8B8A8_P_R8G8B8A8_N_TRANSFORM_n (px, w, h, rs, spx, sw, sh, srs, FFd2s, FF_S, alpha, dbits);
+#ifdef WITH_MMX
+        if (NR_PIXOPS_MMX) {
+            g_get_current_time(&t3);
+            double dt12 = ((t2.tv_sec - t1.tv_sec) * G_USEC_PER_SEC + (t2.tv_usec - t1.tv_usec)) / 1000.0;
+            double dt23 = ((t3.tv_sec - t2.tv_sec) * G_USEC_PER_SEC + (t3.tv_usec - t2.tv_usec)) / 1000.0;
+            total_mmx += dt12;
+            total_no_mmx += dt23;
+            n++;
+	std::cout << "avg. time spent rendering; with mmx -> " << total_mmx/n << " msec | without mmx -> " << total_no_mmx/n << " msec" << std::endl;
+        } else {
+            std::cout << "sorry, no NR_PIXOPS_MMX" << std::endl;
+        }
+#else
+        std::cout << "sorry, no WITH_MMX" << std::endl;
+#endif
+    }
 }
void nr_R8G8B8A8_P_R8G8B8A8_P_R8G8B8A8_P_TRANSFORM (unsigned char *px, int w, int h, int rs,