I'm curious. Why is rasterizegeo so slow?
Gaalvk
I'm trying to solve some animation problems that can't be solved (or I don't understand how) in OpenCL, such as fragment overlays, which are impossible for the inverse transformation but easy for the forward transformation. So I do it through SOP and rasterize, but it's incredibly slow in animation. I keep thinking about this—the viewport does exactly the same thing as rasterize: it sends rays from the camera at each pixel and generates a 2D image. Moreover, the viewport's resolution is much higher, and it calculates color, shading, and a lot more, all at 60-100 fps. But rasterizegeo for the same geometry will drop me to 5-10 fps if there are a lot of polygons. The difference is huge, but why? The essence is the same.
It would be funny to stream the Vulkan viewport directly to the COP.
jlait
We are using Vulkan directly in Rasterize Geo. So it should be using the same pipe as the viewport.

One issue is that we may have to move the geometry to the GPU differently?

What platform are you on? In some cases we have to revert to software for rasterize geo despite the viewport using hardware :<

Capybara + Joint Deform -> rasterize geo is fast for me, for example....
Gaalvk
Windows, 4070ti.
For example, here are the flying fragments, 1000 pieces in total. In SOP, it's 63 fps, but when we insert it into Rasterize Geo, it's 4-5 fps. Even though the layer is 1024*1024 versus a 2K window in SOP. I checked and set the layer size to 2048*2048, and the same Rasterize Geo showed less than 2 fps. It is clear that rasterizing a float attribute is much faster than float3, such as color. The difference with SOP is approximately two orders of magnitude, not percentages. This is surprising. Copernicus itself is quite fast, which is good. But Rasterize Geo ruins the speed if anything is animated.
jlait
To make sure we are on the same page, does the attached show the disparity for you? Or please provide a .hip that shows the disparity.

I currently know of two possible failure modes. If we fall back to software Vulkan, it definitely is game over. But we can also have the Vulkan device and the OpenCL device not be on the same system, which disables interop.
jlait
It was just pointed out to me that we have interop disabled in H21 because of some driver leaks we weren't able to address in time. Living in the future, as I am, I already had the fast interop.

You can try forcing interop to run anyways with:
HOUDINI_COP_RASTERIZE_INTEROP
You have to set this environment variable to 1 and see if it unlocks faster speed. It probably will cause a VRAM leak that will result in unhappiness, however. But it would be good to know if it makes it fast so we know that when this is fixed in future versions you will get a faster rasterize....
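For anyone who wants to try this without changing their global environment, here is a minimal Python sketch that launches Houdini with the variable set for that one session. The variable name comes from the post above; the launcher name `houdini` and the helper functions `build_env` and `launch_houdini` are assumptions for illustration, not part of Houdini. The variable has to be in place before Houdini starts, so this wraps the launch rather than setting it from inside a running session.

```python
import os
import shutil
import subprocess

# Variable name as given in the post above.
INTEROP_VAR = "HOUDINI_COP_RASTERIZE_INTEROP"


def build_env(base=None):
    """Return a copy of the environment with the interop override set to 1.

    `base` defaults to the current process environment; it is never mutated.
    """
    env = dict(os.environ if base is None else base)
    env[INTEROP_VAR] = "1"
    return env


def launch_houdini(executable="houdini"):
    """Start Houdini with the override applied to this session only.

    `executable` is an assumed launcher name; adjust it to your install.
    """
    path = shutil.which(executable)
    if path is None:
        raise FileNotFoundError(f"{executable!r} not found on PATH")
    return subprocess.Popen([path], env=build_env())
```

Calling `launch_houdini()` from a shell where the Houdini environment is already set up should start a session with interop forced on. Given the VRAM leak mentioned above, it is probably best kept to short test sessions.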
Gaalvk
Yes! Everything is exactly as you described!
Your file shows around 60 fps in SOP and 7-8 fps after rasterizing. But after enabling the variable and restarting Houdini, rasterizing also shows 60 fps. This is brilliant! I'm so glad you have a solution and can't wait for the new version; it's a real game-changer! Now we can do anything in SOP. Great, guys!
P.S. Yes, with the variable set to 1, video memory fills up quickly, that's a fact...
jlait
Thank you for testing! Glad this was the source of the problem. IIRC, the fix to our interop woes was not trivial to backport, which is why we chose just to leave it disabled in the current version.