Hi Everyone!
I'm new to Houdini but have about 4 years of experience with Cinema 4D/Redshift. In a new scene I created an MPM Configure node and started the sim. Before caching, the sim runs at about 20 frames-per-MINUTE! And I see very little activity on the GPU. Is there something I need to configure, or would that frame rate be considered normal?
I'm on macOS with a 16-core Intel Xeon and an AMD Radeon W6800X. This GPU supports OpenCL 1.2, which, I believe, is correct for MPM. The GPU has 32 GB of VRAM, and the system has 224 GB of RAM.
Any ideas or suggestions would be greatly appreciated!
Thanks,
Greg
Simple MPM Configure at 20 Frames-per-MINUTE
- GregBollella
- Member
- 60 posts
- Joined: June 2021
- Offline
-
- BabaJ
- Member
- 2162 posts
- Joined: Sept. 2015
- Offline
-
- GregBollella
- Member
- 60 posts
- Joined: June 2021
- Offline
-
- jsmack
- Member
- 8177 posts
- Joined: Sept. 2011
- Offline
GregBollella (quoting BabaJ)
Don't have any advice.
But I tried it on Windows 8 with an i7, 16 GB of RAM, and a GTX 980 (I didn't check how much of the card was used, though).
It took 4 seconds for 20 frames.
Wow, that's fast. Hopefully, I just need to tweak some settings or something simple!
Greg
The first time you run it, it has to compile the OpenCL kernels. This took about a minute for me, but after compiling it's very fast. On my system 100 frames takes 10 seconds or so.
-
- GregBollella
- Member
- 60 posts
- Joined: June 2021
- Offline
jsmack
The first time you run it, it has to compile the OpenCL kernels. This took about a minute for me, but after compiling it's very fast. On my system 100 frames takes 10 seconds or so.
I saw the kernel being compiled but even after that I was getting only 20 frames-per-MINUTE. Clearly, something isn't right!
Thanks for the perf numbers. What system do you have?
Greg
-
- edward
- Member
- 8081 posts
- Joined: July 2005
- Offline
This is on macOS, so YMMV. My first thought is that it's probably not running on the GPU. This could be for any number of reasons: MPM may use certain OpenCL features that are not available in the macOS OpenCL driver, or Houdini may not be detecting the GPU for some reason. For the latter, start up the Houdini Terminal (under Applications > Houdini > ... > Utilities), run "hgpuinfo -c", and see if it's at least trying to use the GPU for OpenCL.
-
- GregBollella
- Member
- 60 posts
- Joined: June 2021
- Offline
edward
This is on macOS, so YMMV. My first thought is that it's probably not running on the GPU. This could be for any number of reasons: MPM may use certain OpenCL features that are not available in the macOS OpenCL driver, or Houdini may not be detecting the GPU for some reason. For the latter, start up the Houdini Terminal (under Applications > Houdini > ... > Utilities), run "hgpuinfo -c", and see if it's at least trying to use the GPU for OpenCL.
Thanks! Below is what I get when I run hgpuinfo -c. At least H knows the GPU is there

bollella@GregsMacPro-2 ~ % hgpuinfo -c
OpenCL Platform Apple
Platform Vendor Apple
Platform Version OpenCL 1.2 (Jun 28 2024 22:57:31)
OpenCL Device AMD Radeon PRO W6800X Duo Compute Engine
OpenCL Type GPU
Device Version OpenCL 1.2
Frequency 1754 MHz
Compute Units 60
Device Address Bits 32
Global Memory 32752 MB
Max Allocation 8188 MB
Global Cache 0 KB
Max Constant Args 8
Max Constant Size 64 KB
Local Mem Size 64 KB
2D Image Support 16384x16384
3D Image Support 2048x2048x2048
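For anyone comparing dumps across machines, here is a minimal Python sketch that turns `hgpuinfo -c` style output into a dictionary so individual fields can be checked or diffed. It assumes the flattened "label value" layout pasted above and a hand-written label list; `KNOWN_LABELS` and `parse_gpuinfo` are names made up for this example, not part of Houdini, and the real tool's layout may differ between versions.

```python
# Labels copied from the dump above, longest first so a longer label
# is always tried before any shorter one.
KNOWN_LABELS = sorted(
    [
        "OpenCL Platform", "Platform Vendor", "Platform Version",
        "OpenCL Device", "OpenCL Type", "Device Version", "Frequency",
        "Compute Units", "Device Address Bits", "Global Memory",
        "Max Allocation", "Global Cache", "Max Constant Args",
        "Max Constant Size", "Local Mem Size", "2D Image Support",
        "3D Image Support",
    ],
    key=len,
    reverse=True,
)


def parse_gpuinfo(text):
    """Map each known label to the rest of its line."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        for label in KNOWN_LABELS:
            if line.startswith(label + " "):
                info[label] = line[len(label):].strip()
                break  # one label per line
    return info


# A few lines taken from the dump above.
sample = """\
OpenCL Platform Apple
OpenCL Device AMD Radeon PRO W6800X Duo Compute Engine
OpenCL Type GPU
Device Version OpenCL 1.2
Global Memory 32752 MB
Max Allocation 8188 MB
"""

info = parse_gpuinfo(sample)
print(info["OpenCL Type"], "|", info["Max Allocation"])
```

On the dump above this reports a GPU device, which only confirms that Houdini sees the card; it says nothing about how well the MPM kernels run on it.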
-
- BabaJ
- Member
- 2162 posts
- Joined: Sept. 2015
- Offline
jsmack
The first time you run it, it has to compile the OpenCL kernels. This took about a minute for me, but after compiling it's very fast. On my system 100 frames takes 10 seconds or so.
Interesting...I'm assuming you have a much 'better' system than I.
For 100 frames ('fresh' laydown of mpm configure in new hip instance):
Initial calculations for 100 frames: 28 seconds.
Running those frames(real time toggle off) 1-100 : 1 second
Running those frames(real time toggle on) 1-100: 4 seconds
***********************************************************************
For 20 frames ('fresh' laydown of mpm configure in new hip instance):
Initial calculations for 20 frames: 4 seconds.
Running those frames (real time toggle off) 1-20: 0 seconds (too fast to measure)
Running those frames (real time toggle on) 1-20: 1 second (just under, barely able to measure)
***********************************************************************
Because my system is outdated, when I start up Houdini I get a message that Vulkan is disabled and it falls back to OpenGL.
Since, like I said, I'm sure you have a better system, I'm wondering: is there a chance that Vulkan is tying up the GPU in some way that prevents MPM from utilizing the GPU as well as it could or was intended to; some non-bug conflict?
Or could there have been some revision of the default parameters or refinement of the MPM code in the daily builds, so that we are comparing apples and oranges here?
Or because your system is on Linux?
Of course I can see a difference with the OP since they are on a Mac, and as someone has already mentioned in this thread, potential issues there because of how things are treated on a Mac.
Edited by BabaJ - Aug. 22, 2024 10:59:59
-
- jsmack
- Member
- 8177 posts
- Joined: Sept. 2011
- Offline
BabaJ
For 100 frames ('fresh' laydown of mpm configure in new hip instance):
Initial calculations for 100 frames: 28 seconds.
Running those frames(real time toggle off) 1-100 : 1 second
Running those frames(real time toggle on) 1-100: 4 seconds
By running frames do you mean cached playback? I'm only looking at the compute time. To test, I disable caching on the mpm configure node, and then move the playbar directly from frame 1 to frame 100 and time how long it took to display.
My system is Windows 10 Intel Comet Lake i9 with an RTX 3090. This test at default res is too light to fully stress the GPU, it only shows 40% usage. Increasing resolution to 0.005 gets it to 99%, but it is also much much slower.
-
- BabaJ
- Member
- 2162 posts
- Joined: Sept. 2015
- Offline
jsmack
By running frames do you mean cached playback? I'm only looking at the compute time. To test, I disable caching on the mpm configure node, and then move the playbar directly from frame 1 to frame 100 and time how long it took to display.
Yes, when I say running frames I mean the 'cached' playback. What I meant by the 'Initial calculations' is the first time I set it at frame 1 and play the set range.(It takes longer the first 'run' and creates a blue line in the play bar after which I get those faster 'runs' I noted).
Interesting you talk about disable caching on mpm configure node - I'm assuming you mean mpm solver node, as far as I can see there actually is no mpm configure node itself. Selecting mpm configure from TAB menu puts down a number of default node set ups, which have the 3 different mpm nodes, container, collider and solver with the other sop nodes.
But even looking at the mpm solver I see no place where it explicitly disables caching, except perhaps ticking on the 'Recompile Kernel' parameter is what you are doing? Or adding some text to the Kernel Options parameter of the mpm solver?
-
- jsmack
- Member
- 8177 posts
- Joined: Sept. 2011
- Offline
BabaJ (quoting jsmack)
By running frames do you mean cached playback? I'm only looking at the compute time. To test, I disable caching on the mpm configure node, and then move the playbar directly from frame 1 to frame 100 and time how long it took to display.
Yes, when I say running frames I mean the 'cached' playback. What I meant by the 'Initial calculations' is the first time I set it at frame 1 and play the set range.(It takes longer the first 'run' and creates a blue line in the play bar after which I get those faster 'runs' I noted).
Interesting you talk about disable caching on mpm configure node - I'm assuming you mean mpm solver node, as far as I can see there actually is no mpm configure node itself. Selecting mpm configure from TAB menu puts down a number of default node set ups, which have the 3 different mpm nodes, container, collider and solver with the other sop nodes.
But even looking at the mpm solver I see no place where it explicitly disables caching, except perhaps ticking on the 'Recompile Kernel' parameter is what you are doing? Or adding some text to the Kernel Options parameter of the mpm solver?
I don't know what node it was, but it's the same DOP network cache feature found on all DOP-based solvers. Disabling caching makes the timing test repeatable and rules out accidentally measuring cached playback. There might also be a performance hit from copying the simulation from the GPU to main memory in order to cache it. Cached playback performance isn't a useful thing to test, at least not when measuring simulation performance.
-
- BabaJ
- Member
- 2162 posts
- Joined: Sept. 2015
- Offline
jsmack
I don't know what node it was, but it's the same dop network cache feature on all dop based solvers.
Yes...I wanted to see specifically...so like with a pyro, there is a 'Cache' tab on the DOP Network node under which a tick box for Cache Simulation parameter is available.
However, with mpm configure there is no DOP Network node (at least none 'exposed'), and there is no Cache Simulation tick box parameter anywhere to be found either.
-
- jsmack
- Member
- 8177 posts
- Joined: Sept. 2011
- Offline
-
- BabaJ
- Member
- 2162 posts
- Joined: Sept. 2015
- Offline
jsmack
Advanced tab under simulation: checkbox beside the Cache Memory slider.
Thanks jsmack for pointing out the 'obvious' - think my eyes glazed over that one because of the slider part.
But for the sake of the thread, with the Cache Memory slider ticked OFF, in version 20.5.332, "MPM configure" default set up as is:
I get 20 Frames in 4 seconds.
100 Frames in 28 seconds.
-
- GregBollella
- Member
- 60 posts
- Joined: June 2021
- Offline
BabaJ
But for the sake of the thread, with the Cache Memory slider ticked OFF, in version 20.5.332, "MPM configure" default set up as is:
I get 20 Frames in 4 seconds.
100 Frames in 28 seconds.
Same setup, new scene, MPM Configure, no caching, after kernel compile.
On my system I get:
First 20 frames -> 55 seconds
First 100 frames -> 5 minutes and 21 seconds.
Greg
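To put the reported timings on a common scale, here is a small Python sketch that converts them to frames per second. The numbers are the ones posted in this thread (no caching, after kernel compile); the function name and labels are only for illustration.

```python
def frames_per_second(frames, seconds):
    """Average simulation throughput over a run."""
    return frames / seconds


# (frames, seconds) as reported in the thread.
reports = {
    "BabaJ, 20 frames": (20, 4),
    "BabaJ, 100 frames": (100, 28),
    "Greg, 20 frames": (20, 55),
    "Greg, 100 frames": (100, 5 * 60 + 21),
}

for label, (frames, seconds) in reports.items():
    print(f"{label}: {frames_per_second(frames, seconds):.2f} fps")

# Ratio of the two 100-frame wall-clock times.
slowdown = (5 * 60 + 21) / 28
print(f"100-frame slowdown, Greg vs BabaJ: {slowdown:.1f}x")
```

Greg's 20 frames in 55 seconds works out to about 22 frames per minute, in line with the original "20 frames-per-MINUTE" observation, and his 100-frame run takes more than 11 times as long as BabaJ's.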
-
- BabaJ
- Member
- 2162 posts
- Joined: Sept. 2015
- Offline
GregBollella
Same setup, new scene, MPM Configure, no caching, after kernel compile.
On my system I get:
First 20 frames -> 55 seconds
First 100 frames -> 5 minutes and 21 seconds.
Yeah, I would definitely put in an RFE/bug report (and maybe link to this thread, since both jsmack and I list our machines). Hopefully they can give you some feedback as to what's going on.
You have a much better system than I and should be getting much better performance.
-
- jsmack
- Member
- 8177 posts
- Joined: Sept. 2011
- Offline
GregBollella
Same setup, new scene, MPM Configure, no caching, after kernel compile.
On my system I get:
First 20 frames -> 55 seconds
First 100 frames -> 5 minutes and 21 seconds.
Greg
Manually changing HOUDINI_OCL_DEVICETYPE to CPU, I get:
20 frames: 12.9 s
100 frames: 1 min 19 s
10-core Intel i9-10850K at 4.8 GHz
Do you get the same performance if you force CPU mode?
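One way to force CPU mode for a single test run, without touching the global configuration, is to set the variable only in the environment of the launched process. The sketch below is Python and makes a few assumptions: the variable name is taken from the `hconfig` output posted in this thread (`HOUDINI_OCL_DEVICETYPE`), `CPU` is assumed to be an accepted value, and `env_with_ocl_device` is a helper name invented here.

```python
import os


def env_with_ocl_device(device_type, base=None):
    """Return a copy of the environment with the OpenCL device type overridden."""
    env = dict(os.environ if base is None else base)
    env["HOUDINI_OCL_DEVICETYPE"] = device_type
    return env


cpu_env = env_with_ocl_device("CPU")
print(cpu_env["HOUDINI_OCL_DEVICETYPE"])

# To launch with it (assumes a `houdini` executable on PATH, which
# depends on the local install):
# import subprocess
# subprocess.run(["houdini"], env=cpu_env)
```

Because the override lives in a copied dictionary, the shell and any other running Houdini sessions keep their original settings, which makes GPU vs CPU comparisons easier to repeat.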
-
- GregBollella
- Member
- 60 posts
- Joined: June 2021
- Offline
Setting HOUDINI_OCL_DEVICETYPE to CPU, I get (Intel Xeon W 16-core 4.2 GHz):
20 frames: 40 seconds
100 frames: 4 minutes 14 seconds
So, slightly faster for me on the CPU vs the GPU, but still nowhere near your numbers. The cores are very lightly used when the sim is running. Total utilization is only about 56% (about half of a core).
Below are some env vars (from hconfig in the Houdini terminal). It seems like some important values are missing. Could you do a similar dump so we can compare?
bollella@GregsMacPro-2 ~ % hconfig -a | grep OCL
HOUDINI_OCL_DEVICENUMBER := -1
HOUDINI_OCL_DEVICETYPE := ''
HOUDINI_OCL_FEATURE_DISABLE := '<not defined>'
HOUDINI_OCL_IMAGE_ADVECTION := 1
HOUDINI_OCL_MEMORY_POOL_SIZE := 0.125
HOUDINI_OCL_OGL_INTEROP := 1
HOUDINI_OCL_PATH := '<not defined>'
HOUDINI_OCL_REPORT_BUILD_LOGS := 0
HOUDINI_OCL_REPORT_MEMORY_USE := 0
HOUDINI_OCL_VENDOR := '<not defined>'
HOUDINI_USE_HFS_OCL := 1
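To compare two such dumps, a minimal Python sketch could parse the `NAME := value` lines and list the settings that differ. The parsing rules are inferred from the output above, not from any documented format, and the second machine's values are invented purely to show the diff.

```python
def parse_hconfig(text):
    """Parse `NAME := value` lines; '<not defined>' becomes None."""
    settings = {}
    for line in text.splitlines():
        if ":=" not in line:
            continue  # skip shell prompts and blank lines
        name, _, value = line.partition(":=")
        value = value.strip().strip("'")
        settings[name.strip()] = None if value == "<not defined>" else value
    return settings


def diff_settings(a, b):
    """Return {name: (value_in_a, value_in_b)} for settings that differ."""
    names = sorted(set(a) | set(b))
    return {n: (a.get(n), b.get(n)) for n in names if a.get(n) != b.get(n)}


greg = parse_hconfig("""\
bollella@GregsMacPro-2 ~ % hconfig -a | grep OCL
HOUDINI_OCL_DEVICENUMBER := -1
HOUDINI_OCL_DEVICETYPE := ''
HOUDINI_OCL_VENDOR := '<not defined>'
""")

# Hypothetical second dump, made up for illustration only.
other = parse_hconfig("""\
HOUDINI_OCL_DEVICENUMBER := -1
HOUDINI_OCL_DEVICETYPE := 'GPU'
HOUDINI_OCL_VENDOR := '<not defined>'
""")

for name, (mine, theirs) in diff_settings(greg, other).items():
    print(f"{name}: {mine!r} vs {theirs!r}")
```

An empty string and an undefined value are kept distinct here, since the dump itself distinguishes `''` from `'<not defined>'`.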
-
- AlexandreSV
- Staff
- 149 posts
- Joined: June 2023
- Offline
Hi Greg, please do log a bug about this performance. I don't know why you are experiencing this, but it is worth investigating. We had Mac users testing MPM during the Alpha/Beta and I didn't get reports of massive slowdowns similar to what you are describing here. I will try to reproduce this on our end and will get back to you with an update. Cheers!
-
- GregBollella
- Member
- 60 posts
- Joined: June 2021
- Offline
AlexandreSV
Hi Greg, please do log a bug about this performance. I don't know why you are experiencing this, but it is worth investigating. We had Mac users testing MPM during the Alpha/Beta and I didn't get reports of massive slowdowns similar to what you are describing here. I will try to reproduce this on our end and will get back to you with an update. Cheers!
Thanks Alexandre! Here's the ticket number, SideFX Support Ticket: #156502
BTW, I worked with the support team at INSYDIUM a while back on a bug involving the GPUs I have and Nexus. I believe the problem that was eventually found was that macOS/GPU were reporting a subgroup size of 64 but behaving as though it was 32. But take this with a grain of salt, as I don't remember exactly and the beta thread is no longer available.
The symptoms of the Nexus issue were different but I thought it worth mentioning.
Greg