Can we render vector graphics on the GPU, processing everything massively in parallel?

I got told off for this monologue, so I hope it's a good one...
I didn't manage to draw a conclusion, but I did come up with a way of testing whether the vector drawing calculations can be offloaded to the GPU, which would in theory make them blazingly fast. It would also be of interest to other open-source graphics projects, so I may get in touch with the Krita guys https://krita.org/en/download/krita-desktop/ as I know they have some GPU optimizations, and see what mileage they got out of them.
At the very least, SIMD on the CPU can perform the four multiplications required for a spline in a single instruction; it's just a case of what performance gains can be made rendering everything else involved beyond basic line drawing.
---------------------------
I've got my Béziers down to this... a * p[0] + b * p[2] + c * p[4] + d * p[6] etc. The points are fixed for each segment, and a, b, c, d are functions of the fraction t. That should vectorize nicely and do all four mul operations in a single instruction. The number of fractions in t really depends on the length of the curve, which makes the performance bottleneck populating the arrays for the vectorized calculations.

It's always possible to have arrays prefilled with the fractions for given numbers of steps. That would avoid the overhead of the CPU having to go out of cache to fetch data from an irregularly populated memory block... but just having a preallocated array to stick all the fractions in should overcome that too.

So the only question remaining, assuming the fractions aren't precalculated, is what the performance difference is between populating a preallocated array of fractions and then doing one massive vector operation on the lot in batch, versus calculating the fractions on the fly and just doing a little vector operation each time. It should be possible to hold all the calculations of the fractions inside registers within the loop, which should make on-the-fly faster than populating a vector, as there are no worries about the CPU having to fetch data from odd places. GCC is supposed to auto-vectorize, but I think I'll have a crack at using the extensions to see what difference there is between GCC's auto-vectorization and optimal
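A minimal sketch of the two variants being compared, using the GCC/Clang vector extension. The function names, and the assumption that `p` holds four control points as interleaved x,y pairs (which is what the p[0], p[2], p[4], p[6] indexing suggests), are mine for illustration and not taken from any existing code.

```c
#include <assert.h>
#include <stddef.h>

/* Four packed floats, via the GCC/Clang vector_size extension. */
typedef float v4f __attribute__((vector_size(16)));

/* Cubic Bernstein weights a, b, c, d for a fraction t in [0, 1]. */
v4f bezier_weights(float t)
{
    float u = 1.0f - t;
    v4f w = { u * u * u, 3.0f * u * u * t, 3.0f * u * t * t, t * t * t };
    return w;
}

/* a*p0 + b*p1 + c*p2 + d*p3: one vector multiply, then a horizontal sum. */
float dot4(v4f w, v4f p)
{
    v4f m = w * p;
    return m[0] + m[1] + m[2] + m[3];
}

/* Variant 1: fill a preallocated array of fractions, then evaluate in batch.
 * p is assumed to be x0,y0,x1,y1,x2,y2,x3,y3; out_xy receives n x,y pairs. */
void eval_batch(const float p[8], float *t, float *out_xy, size_t n)
{
    assert(n > 1);
    v4f px = { p[0], p[2], p[4], p[6] };
    v4f py = { p[1], p[3], p[5], p[7] };
    for (size_t i = 0; i < n; i++)
        t[i] = (float)i / (float)(n - 1);
    for (size_t i = 0; i < n; i++) {
        v4f w = bezier_weights(t[i]);
        out_xy[2 * i] = dot4(w, px);
        out_xy[2 * i + 1] = dot4(w, py);
    }
}

/* Variant 2: compute each fraction on the fly, with no fraction array. */
void eval_on_the_fly(const float p[8], float *out_xy, size_t n)
{
    assert(n > 1);
    v4f px = { p[0], p[2], p[4], p[6] };
    v4f py = { p[1], p[3], p[5], p[7] };
    for (size_t i = 0; i < n; i++) {
        v4f w = bezier_weights((float)i / (float)(n - 1));
        out_xy[2 * i] = dot4(w, px);
        out_xy[2 * i + 1] = dot4(w, py);
    }
}
```

Timing both functions over the same curve with a large `n` would be one way to answer the batch versus on-the-fly question; the sketch only sets up the comparison and makes no claim about which wins.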
oliverthered 12:52 AM
Looks like OpenCL should give the best bulk operation solution. I think it's only 32-bit precision, but that just means quadrupling up the data and scaling the fractional component into nice 32-bit chunks. A standard loop would be quicker than calculating all the points in bulk into memory, but that's assuming you only want to draw a pixel line with no other operations performed upon it. If you have all the points of the line calculated and held in memory on the GPU, you can then perform a load of other operations upon them in bulk, like anti-aliasing. It is a shame the GPU can't just take the values stored in what is effectively a texture used for bulk parallel processing and turn them into pixel co-ordinates to draw on another texture; otherwise it would be possible to calculate and draw all the vector points on the GPU. If you calculate many lines simultaneously on the GPU, the overhead of transferring memory over the bus should be well mitigated.
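To get a feel for how much the 32-bit question matters, here is a rough CPU-side sketch that evaluates one coordinate of a cubic Bézier in both 32-bit and 64-bit floating point and reports the largest difference. It is not OpenCL code, and the function names are hypothetical; it only gives an idea of the error to expect for a given coordinate range before any scaling tricks are applied.

```c
#include <assert.h>
#include <stddef.h>

/* One coordinate of a cubic Bezier, evaluated in 32-bit floats. */
float bezier1_f32(const float p[4], float t)
{
    float u = 1.0f - t;
    return u * u * u * p[0] + 3.0f * u * u * t * p[1]
         + 3.0f * u * t * t * p[2] + t * t * t * p[3];
}

/* The same coordinate, evaluated in 64-bit doubles as the reference. */
double bezier1_f64(const double p[4], double t)
{
    double u = 1.0 - t;
    return u * u * u * p[0] + 3.0 * u * u * t * p[1]
         + 3.0 * u * t * t * p[2] + t * t * t * p[3];
}

/* Largest absolute difference between the two over n evenly spaced fractions. */
double max_f32_error(const double p[4], size_t n)
{
    assert(n > 1);
    float pf[4] = { (float)p[0], (float)p[1], (float)p[2], (float)p[3] };
    double worst = 0.0;
    for (size_t i = 0; i < n; i++) {
        double t = (double)i / (double)(n - 1);
        double diff = (double)bezier1_f32(pf, (float)t) - bezier1_f64(p, t);
        if (diff < 0.0)
            diff = -diff;
        if (diff > worst)
            worst = diff;
    }
    return worst;
}
```

Calling `max_f32_error` with control points at the scale of a real document shows whether plain 32-bit floats already stay well below a pixel, in which case the extra scaling may not be needed for that range.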
Anyhow, that was some interesting research. When I get around to taking a look at the Inkscape code, I should be able to form some idea of just how much of the vector drawing work can be calculated in bulk on the GPU. Certainly, all the operations requiring tangents and normals should be achievable given an array of line fractions and the spline control points. SVG drawing is complex, so there's a lot of overhead in the actual drawing process compared to line drawing, and holding an array of points to be drawn in memory may actually make performance sense.
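As a sketch of the tangent and normal step, assuming cubic segments with control points stored as interleaved x,y pairs: the tangent is the first derivative of the Bézier, and a normal is that vector rotated by 90 degrees. The function names are hypothetical, and the normals are left unnormalised to keep the example short.

```c
#include <assert.h>
#include <stddef.h>

/* First derivative of a cubic Bezier at t:
 * 3(1-t)^2 (P1-P0) + 6(1-t)t (P2-P1) + 3t^2 (P3-P2).
 * p is assumed to be x0,y0,x1,y1,x2,y2,x3,y3. */
void bezier_tangent(const float p[8], float t, float out[2])
{
    float u = 1.0f - t;
    float a = 3.0f * u * u, b = 6.0f * u * t, c = 3.0f * t * t;
    for (int k = 0; k < 2; k++)
        out[k] = a * (p[2 + k] - p[k])
               + b * (p[4 + k] - p[2 + k])
               + c * (p[6 + k] - p[4 + k]);
}

/* Tangents and unnormalised normals for a whole array of fractions.
 * tan_xy and nrm_xy each receive n interleaved x,y pairs. */
void tangents_and_normals(const float p[8], const float *t, size_t n,
                          float *tan_xy, float *nrm_xy)
{
    assert(p && t && tan_xy && nrm_xy);
    for (size_t i = 0; i < n; i++) {
        bezier_tangent(p, t[i], &tan_xy[2 * i]);
        nrm_xy[2 * i] = -tan_xy[2 * i + 1];   /* rotate tangent by 90 degrees */
        nrm_xy[2 * i + 1] = tan_xy[2 * i];
    }
}
```

Each iteration depends only on the control points and its own fraction, so the loop body is the kind of work that could map onto one GPU work item per fraction.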