Hi Luke,
Sorry, just looked at this (somehow the bug did not show up in the bug database).
When debugging whitewater problems, the first place to look is always Whitewater Source, so you can see where you're actually emitting whitewater. In your case if you look at that node, you'll see the entire FLIP sim is showing up there as emission points!
This is because you're not importing and writing out your surface and vel fields to disk along with your FLIP particles. When Whitewater Source calculates acceleration, it compares the current points to the previous frame's velocity field. If the vel field is missing, it's always comparing the particles' velocity to 0, so there's always acceleration (admittedly we should do a much better job of erroring out if the vel or surface fields are not there.) Anyway, in the current file you're sending in your entire FLIP sim as emitter particles (over 12M points), and it's replicating whitewater points at each particle; not surprisingly you're quickly running out of memory. (Houdini is also particularly memory-hungry under Windows, unfortunately.)
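To illustrate why a missing vel field turns every particle into an emitter, here is a minimal Python sketch. It is not the Whitewater Source implementation: the function name, the threshold, and the way acceleration is estimated are all assumptions made for illustration.

```python
import math

def emitting_points(particle_vels, prev_field_vels, dt, accel_threshold):
    """Return indices of particles whose estimated acceleration exceeds a threshold.

    prev_field_vels holds the previous frame's velocity field sampled at each
    particle, or None when the vel field was never written to disk.
    """
    emitters = []
    for i, v in enumerate(particle_vels):
        # A missing field behaves like a field that is zero everywhere.
        v_prev = prev_field_vels[i] if prev_field_vels is not None else (0.0, 0.0, 0.0)
        accel = math.dist(v, v_prev) / dt
        if accel > accel_threshold:
            emitters.append(i)
    return emitters

dt = 1.0 / 24.0
vels = [(2.0, 0.0, 0.0), (2.0, 0.1, 0.0), (0.0, -3.0, 0.0)]
# Calm flow: the field from the previous frame nearly matches the particles.
prev = [(2.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, -3.0, 0.0)]

with_field = emitting_points(vels, prev, dt, accel_threshold=10.0)
without_field = emitting_points(vels, None, dt, accel_threshold=10.0)
```

With the field present none of these calm particles qualify; with the field missing, every moving particle is flagged, which is the behavior described above.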
However, even if you fix this, you've got your initial particles situated a fair bit above the bottom of the fluid tank for the FLIP sim, so they immediately fall down due to gravity and make a big splash, adding more acceleration. You need to make sure your Volume Limits align with the edges of your initial particle box, or just use the FLIP tank shelf tool.
With those issues fixed your whitewater sim should no longer run out of memory. However for some reason the .hip file preferences have Unit Length 10M = 1M (Preferences | Hip File Options menu item). It appears that was set before you created the whitewater sim but after the FLIP sim, so those two sims are operating at different scales and your gravity, buoyancy, foam depth, etc. on the whitewater sim are all in the wrong units and look strange.
Finally you might consider animating your ship at the Object level and turning off Use Deforming Geometry. Not a really big deal, but it would avoid re-calculating the collision SDF each timestep.
I've attached a file with a few of these fixes that yields a reasonable whitewater sim.
Edit: sorry, forgot one more thing: you're also visualizing the different whitewater particles by type, which is fine, but internally actually creates a copy of all the points with colors attached. So that makes the memory use worse if you're displaying the points while simulating.
For big whitewater sims you're better off writing straight to disk; then, if you want to visualize, use the various point groups for foam, spray, etc. and color them in SOPs.
Houdini Lounge » Whitewater Memory Leaks
- johner
- 815 posts
- Offline
Technical Discussion » Hacking DOP POP Collisions
- johner
- 815 posts
- Offline
jacob clark
I should have mentioned, I was referring to VOP SDF Pop collisions.
Sorry, I misunderstood. You're right there's nothing built-in to do particle-SDF collisions without bringing the SDFs into DOPs. It has been requested in the past (RFE: 65193)
jacob clark
The speed gain is tremendous.
Investigating this I realized we were doing an excessive amount of work in Volume Sample mode when the incoming collision SDF has no transform on it (i.e. Use Object Transform is off), as in your case. For that setup we can just copy the raw voxel values into DOPs rather than sampling in world space. I made the following change for tomorrow's build:
Houdini 14.0.228: Wed Jan 28 15:21:45 CST 2015
Optimized the Volume Sample mode for importing collision SDF's
for the Static Object DOP, when its Use Object Transform parameter is off.
Also updated the Deforming Object shelf tool to turn off Use Object
Transform when the collision object has no object-level scaling.
For your test file this dropped the GasIntegrator average solve time from 139ms to 5ms. Total time is not much higher than the VOP solution, although you do still have to deal with the large collision object filling up your DOPnet cache. On the plus side the collisions will be much more accurate, and in general I think you'll find DOP POPs much faster than old POPs once you scale up to high particle counts.
Please let us know if you get a chance to try this fix.
Technical Discussion » Hacking DOP POP Collisions
- johner
- 815 posts
- Offline
jacob clark
Due to the nature of the shots I am working on, it doesn't look like the collision shapes will get any simpler. I do enjoy that the Pyro solver can use a straight volume field for collisions. I would love to see a similar simple/quick treatment applied to POPs.
Hmm, well the Pyro solver does have to import the VDB into DOPs, in this case into a temporary scalar field that it then unions / adds into the existing collision fields, inside the SourceVolume asset. I'm a bit surprised it's much faster, since it's a similar operation of sampling VDBs into Houdini native volumes.
Maybe because the Pyro collision fields are lower resolution than the VDB? Static Object's Volume Proxy mode always creates an SDF of the same size / resolution as the input VDB before sampling into it. So you might be seeing better sampling times just due to the lower resolution collision field in Pyro?
jacob clark
I can stick w/ old POPs for now, and can't wait to see if there are more speed improvements on this collision model ( bypassing of the sdf rebuild? )
Have you tried turning off Use Volume Based collisions and using straight particle->poly collisions? I'm not familiar with old POPs collisions, but I think that's all it's doing. (if maybe using a different code path)
Technical Discussion » Hacking DOP POP Collisions
- johner
- 815 posts
- Offline
jacob clark
While I can't get the #2 to work properly (DOP POPs seems to have a lot more to think about when it comes to resetting a particle's velocity) you'll notice that it is almost 20x faster than #1 (this time difference grows exponentially with an increase in the SDF resolution).
Looking through the performance monitor. For #1, It seems the biggest slow down is the collisions on the particle integrator.
I guess my question is, how can I get #2 to work. That is, read in my own SDF, ignore the POP DOP particle collisions (slow), and have a stable sim? Is there a DOP node that helps achieve my goal, can I achieve it in VOPs, or is it all under the hood.
Ah. The time you're seeing in the particle integrator is actually when it builds the collision object's internal DOP SDF on demand. In other words, that's when the actual Volume Sample from VDB to DOPs SDF is happening, which is why it gets slower at higher res. The actual particle collisions themselves are quite fast once that conversion is done (less than a millisecond in this example) and should scale well to many more particles, at least in H14.
You could use the File Mode on the Static Object to write those converted volumes to disk in the internal DOP format as .sim data, but that just moves most of the time to loading the data from disk since now it's a dense format instead of the sparse VDB. (Though that is a good way to verify that it's not Gas Integrator's collisions taking all the time.)
Unfortunately all the built-in DOPs nodes assume the SDF data is somehow accessible as DOPs data, whether as collision SDFs on objects or scalar fields, so either way you have to first import / convert them to DOPs somehow.
You could write your own POP Wrangle that volumesamples from a VDB directly and uses the result of volumegradient to get a normal, plus maybe point clouds to get the nearest point velocity, but none of that is really easy. Also the internal SDF collision code does a lot to sweep out the collisions over time and get watertight collisions even for fast moving particles. But that would be your best bet.
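As a rough picture of what such a hand-rolled collision would involve, here is a minimal Python sketch (not VEX, and not the internal DOP collision code). It uses an analytic sphere in place of a sampled VDB, and the function names and the bounce parameter are hypothetical. It only corrects the position at the end of the step, so it has none of the swept, watertight behavior mentioned above.

```python
import math

def sphere_sdf(p, center=(0.0, 0.0, 0.0), radius=1.0):
    """Signed distance to a sphere: negative inside, positive outside."""
    return math.dist(tuple(p), center) - radius

def sdf_normal(sdf, p, h=1e-4):
    """Central-difference gradient of the SDF, normalized to a surface normal."""
    g = []
    for axis in range(3):
        hi = list(p)
        lo = list(p)
        hi[axis] += h
        lo[axis] -= h
        g.append((sdf(tuple(hi)) - sdf(tuple(lo))) / (2.0 * h))
    length = math.sqrt(sum(c * c for c in g)) or 1.0
    return tuple(c / length for c in g)

def collide(p, v, sdf, bounce=0.5):
    """Push a penetrating particle back to the surface and reflect its velocity."""
    d = sdf(p)
    if d >= 0.0:
        return p, v  # outside the collider, nothing to do
    n = sdf_normal(sdf, p)
    # Move along the normal by the penetration depth (d is negative here).
    p = tuple(pc - d * nc for pc, nc in zip(p, n))
    vn = sum(vc * nc for vc, nc in zip(v, n))
    if vn < 0.0:
        # Remove the inward normal component and add back a damped bounce.
        v = tuple(vc - (1.0 + bounce) * vn * nc for vc, nc in zip(v, n))
    return p, v

pos, vel = collide((0.5, 0.0, 0.0), (-1.0, 0.2, 0.0), sphere_sdf)
```

A fast particle that passes completely through a thin collider within one timestep would never register a negative distance here, which is the case the built-in swept collisions handle.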
So I guess the good news is if you use regular DOPs SDF collisions, your setup shouldn't get much slower for very large numbers of particles; the bad news is you have a significant baseline cost at that high a VDB resolution.
Another option is to turn off Use Volume Based Collisions, which skips SDF creation altogether and just uses the polygons. Depending on the resolution of your mesh that may or may not be faster.
Either way simplifying / reducing complexity on the mesh as much as possible is a good idea before sending into DOPs.
Technical Discussion » Hacking DOP POP Collisions
- johner
- 815 posts
- Offline
circusmonkey
You can already, just make your object a volume and also select it as a proxy volume
Note that in H14 the Deforming Object shelf tool will set this up for you. If your object really isn't deforming, just uncheck Use Deforming Geometry on the resulting Static Object DOP.
Internally the Gas Collision Detect and Gas Integrator use the same code path for SDF collisions, and both were significantly optimized for H14.
Houdini Lounge » Viscous flip fluid with bounce
- johner
- 815 posts
- Offline
Technical Discussion » Open CL Settings
- johner
- 815 posts
- Offline
sekow
Everything beyond 380 Divisions won't work, despite the fact that 4 GB are left untouched.
Unfortunately the current Nvidia OpenCL drivers only allow addressing 32-bits of memory, meaning Houdini (and all OpenCL clients) can't use anything above 4GB on the larger Nvidia cards. We are in contact with Nvidia about the issue, but unfortunately have no news to report. This limitation does not exist under CUDA, so it's not a hardware issue.
The Intel CPU drivers are naturally 64-bit on 64-bit OS's, since they share the operating system memory space. Hence the very large sims possible with it.
Although untested, it might also be possible to use above 4GB on AMD cards via an environment variable:
http://devgurus.amd.com/message/1286769 [devgurus.amd.com]
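For a feel of how quickly dense fields approach the 32-bit limit described above, here is a back-of-the-envelope Python sketch. It assumes a cubic grid and single-precision voxels; how many fields the solver actually allocates, and how Divisions maps onto each axis, are not stated in the thread, so treat the numbers as illustrative only.

```python
ADDRESSABLE_BYTES = 2 ** 32  # 32-bit addressing: 4 GiB

def scalar_field_bytes(divisions, bytes_per_voxel=4):
    """Memory for one dense single-precision scalar field on a cubic grid."""
    return divisions ** 3 * bytes_per_voxel

def fields_that_fit(divisions, limit=ADDRESSABLE_BYTES):
    """How many such fields fit under the addressable limit."""
    return limit // scalar_field_bytes(divisions)

per_field = scalar_field_bytes(380)
count = fields_that_fit(380)
print(f"{per_field / 2**20:.0f} MiB per field, {count} fields under 4 GiB")
```

Vector fields, temporary buffers and display memory all draw on the same address range, so the practical ceiling is lower than this simple count suggests.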
Technical Discussion » viscous fluid through a pipe
- johner
- 815 posts
- Offline
Skybar
The error message it gives you says it all; “No viscosity solve because all fluid was extrapolated into the collision field”.
In other words, reduce Surface Extrapolation under Volume Motion -> Collisions on the FLIP solver. OR, reduce the Particle Separation, which in turn will give you smaller voxels and then the Surface Extrapolation won't have the same radius at the same value.
Right, that error comes from this change:
Friday, November 15, 2013
Houdini 13.0.235: The Viscosity solver now attempts to detect bad collision fields, which can lead to no viscosity solve or freezing fluids, and issues a warning in that case.
It could probably offer a more useful warning, but this situation typically arises from a “bad” collision field, which in the case of a collision VDB usually means insufficient External Bandwidth. In that case all distance queries return a value inside the extrapolation distance, and the entire field incorrectly looks like it is filled with a collision object.
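The failure mode can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the solver's actual test: the function names, the clamping model of a narrow-band SDF, and the parameter names are hypothetical.

```python
def narrow_band_distance(true_distance, voxel_size, exterior_band_voxels):
    """A narrow-band SDF stores distances only up to the band width; beyond it the value is clamped."""
    band = voxel_size * exterior_band_voxels
    return min(true_distance, band)

def looks_like_collision(true_distance, voxel_size, exterior_band_voxels, extrapolation_distance):
    """True when the stored distance falls inside the extrapolation distance."""
    stored = narrow_band_distance(true_distance, voxel_size, exterior_band_voxels)
    return stored < extrapolation_distance

voxel = 0.05
extrap = 0.2     # hypothetical extrapolation distance in world units
far_point = 5.0  # a point far from any collider

too_narrow = looks_like_collision(far_point, voxel, exterior_band_voxels=3,
                                  extrapolation_distance=extrap)
wide_enough = looks_like_collision(far_point, voxel, exterior_band_voxels=6,
                                   extrapolation_distance=extrap)
```

With a band of three voxels the far point is clamped to 0.15 and is mistaken for being near a collider; with six voxels the clamped value of 0.3 lies safely outside the extrapolation distance.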
Houdini Learning Materials » Pop replicate Life expectancy
- johner
- 815 posts
- Offline
This should be fixed in tomorrow's daily build, along with a couple of other POP Replicate issues:
Fixed several issues with POP Replicate: The Use Inherited
Velocity setting was resulting in zero initial velocity.
The lifespan of replicated particles was always inherited from
the source particles; now it is always set from the Life
parameters. Finally, the Inherit Velocity parameter was being
applied twice, so its effect was squared. This has been corrected.
Use the square of your current values to match existing simulations.
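The last point of that note is simple arithmetic, shown here as a small Python sketch. The function names are hypothetical and the speeds are scalar for brevity; only the "applied twice" behavior comes from the change note.

```python
def old_inherited_speed(source_speed, inherit_velocity):
    """Behavior before the fix: the Inherit Velocity scale was applied twice."""
    return source_speed * inherit_velocity * inherit_velocity

def new_inherited_speed(source_speed, inherit_velocity):
    """Behavior after the fix: the scale is applied once."""
    return source_speed * inherit_velocity

def matching_value(old_inherit_velocity):
    """Value to enter after the fix to reproduce an existing simulation."""
    return old_inherit_velocity ** 2

old_setting = 0.5
before = old_inherited_speed(10.0, old_setting)
after = new_inherited_speed(10.0, matching_value(old_setting))
```

An old setting of 0.5 therefore becomes 0.25 after the fix, and both give the same inherited speed.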
Houdini Lounge » PyroFX smoke bug
- johner
- 815 posts
- Offline
Sometimes MacCormack advection can cause minor artifacts, so it might be worth trying BFECC advection for density and velocity. It's a bit more expensive but less prone to artifacts. (Advanced | Advection tab)
Technical Discussion » Tesla with Houdini 12 functionality
- johner
- 815 posts
- Offline
5DN
I'd tried your advice and loads of alternate methods I can think of to no avail.
I get error message:
OpenCL Exception: No OpenCL platform found. (-32)
Hmm, you get this error even when no environment variables are set? It could be a driver problem, but by default the Nvidia drivers should include an OpenCL driver. Your regular Nvidia video card should at least show up in the About Houdini | Details box as an OpenCL device.
Have you tried any other OpenCL-based software like LuxMark?
http://www.luxrender.net/wiki/LuxMark [luxrender.net]
Technical Discussion » FLIP pressure projection - threading issues
- johner
- 815 posts
- Offline
Very interesting tests, Dan. As much benchmarking as we've done of FLIP, I don't think we've ever tested hardware affinity with Use Preconditioner off. Jacobi preconditioning is a very simple pre-conditioning scheme that is trivial to multithread but consists of a few operations over very large memory buffers, so represents the classic case for being memory bandwidth bound.
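To show why Jacobi preconditioning is so simple, here is a minimal pure-Python sketch of conjugate gradient with a Jacobi preconditioner on a small tridiagonal system. It is not the FLIP pressure solver; the matrix, function names and tolerances are assumptions chosen for illustration.

```python
def jacobi_apply(diag, r):
    """Jacobi preconditioner: z = r / diag(A). One independent divide per entry."""
    return [ri / di for ri, di in zip(r, diag)]

def matvec(diag, off, x):
    """Multiply by a symmetric tridiagonal matrix with diagonal `diag` and constant off-diagonal `off`."""
    n = len(x)
    y = [diag[i] * x[i] for i in range(n)]
    for i in range(n - 1):
        y[i] += off * x[i + 1]
        y[i + 1] += off * x[i]
    return y

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def pcg(diag, off, b, tol=1e-10, max_iter=200):
    """Preconditioned conjugate gradient for a symmetric positive definite system."""
    x = [0.0] * len(b)
    r = list(b)
    z = jacobi_apply(diag, r)
    p = list(z)
    rz = dot(r, z)
    for _ in range(max_iter):
        if dot(r, r) ** 0.5 < tol:
            break
        Ap = matvec(diag, off, p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        z = jacobi_apply(diag, r)
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x

diag = [2.0, 3.0, 4.0, 3.0, 2.0]   # diagonally dominant, so the matrix is SPD
b = [1.0, 0.0, 0.0, 0.0, 1.0]
x = pcg(diag, -1.0, b)
residual = [bi - yi for bi, yi in zip(b, matvec(diag, -1.0, x))]
```

Every step in the loop is a streaming pass over whole arrays with very little arithmetic per element, which is why this kind of solve tends to be limited by memory bandwidth rather than compute once the arrays no longer fit in cache.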
There's not currently any way of setting affinity at the level of detail you're describing in Houdini. We're mostly limited to what TBB (Intel's Threading Building Blocks) gives us, and there's only limited support for hardware affinity (you can read what they offer here [software.intel.com] if interested).
As I understand it, we might see a speedup using the TBB affinity tools if everything fit into the cache, which for large FLIP sims is clearly not the case. There might still be some speedup available from enforcing thread affinity across internal iterations of the pressure solve, but we'd have to do some testing. Frankly I'm skeptical it will benefit, as I think the limiting factor is on which bus the memory is allocated in the first place. I'll put in an RFE to test thread-level affinity, however.
(The TBB forums have several posts asking how to solve the NUMA problem, so you've identified a common one, I'm afraid).
Houdini Indie and Apprentice » highly viscous flip fluids "sticky" by default?
- johner
- 815 posts
- Offline
robinlawrie
one quick question.. using Johner's method, the mud wont stick to the tyre and wrap around it.. however does this mean i will lose the effect of mud being thrown out backwards by the wheel if it spins?
possibly what Jeff was getting at with his suggestion to control stickyness by wheel rotation velocity.
For coarse control you can just turn down the Scale parameter on the Gas Stick on Collision in the setup I attached. That will just use more of the original collision velocity and add partial stickiness back into the sim.
For higher grain control, yes you might try one of Jeff's excellent suggestions.
Houdini Indie and Apprentice » highly viscous flip fluids "sticky" by default?
- johner
- 815 posts
- Offline
robinlawrie
im not sure if im asking a silly question, or if nobody knows the answer, but this has really stopped me dead at the moment. i want a mud that can be pushed around but not stick to geometry unless i specify.
Definitely not a silly question. The viscosity solver currently only supports a “no slip” condition when considering collision velocities, which essentially means the “stickiness” between collisions and fluid is the same as internally to the fluid. Improving the viscosity solver to support a varying amount of stickiness is a good RFE; I'll add it to the bug db.
In the meantime you can cheat a bit by manipulating the collision velocity field before it goes into the viscosity and pressure solves. Essentially you want to mix the fluid velocity that is tangent to the collisions back into the collision velocity field. So when the viscosity solver looks up the collision velocity it will actually use the fluid's velocity in the tangent direction. The pressure solver ignores tangential velocities anyway (enforcing only a non-penetration constraint), so as long as we modify only the tangential velocities we should be OK using that modified collision velocity for both solves.
Fortunately there's a DOP asset that will do this for you: the Gas Stick on Collision DOP, which ironically we can use in reverse to achieve a non-stickiness by reversing the fields it operates on. In this case it mixes “vel” (fluid) into “collisionvel” (collisions).
This won't work at the fluid boundaries since those are handled differently inside the viscosity solver. If you want slippery walls you'll need to create actual collision geometry for them.
And I'm sure there are cases where this won't work that I'm overlooking, but it's worth a try for your setup. See attached .hip and .mp4. The first pass through the flipbook shows with the velocity modification active; the streamers show modified velocities. The second pass through just shows the default behavior.
Edit: I should add, the Stick on Collision DOP supports scaling its effect via an external field, so you should be able to control the location of the effect. Also it's just an asset with a GasFieldVOP inside, if you want to look at how it works.
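The velocity mixing described above can be sketched per sample point in Python. This is only an illustration of the idea, not the contents of the Gas Stick on Collision asset: the function names are hypothetical, and the scale argument merely mimics how turning down a Scale parameter would keep more of the original collision velocity.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def split(v, n):
    """Split v into components normal and tangent to the unit normal n."""
    vn = dot(v, n)
    normal = tuple(vn * c for c in n)
    tangent = tuple(vc - nc for vc, nc in zip(v, normal))
    return normal, tangent

def slippery_collision_vel(fluid_vel, collision_vel, n, scale=1.0):
    """Blend the fluid's tangential velocity into the collision velocity.

    scale=1 takes the tangential part entirely from the fluid (full slip);
    scale=0 leaves the collision velocity unchanged (default no-slip look).
    The normal component always stays that of the collider.
    """
    c_normal, c_tangent = split(collision_vel, n)
    _, f_tangent = split(fluid_vel, n)
    tangent = tuple(ct + scale * (ft - ct) for ct, ft in zip(c_tangent, f_tangent))
    return tuple(cn + t for cn, t in zip(c_normal, tangent))

n = (0.0, 1.0, 0.0)          # collision surface normal
fluid = (3.0, -1.0, 0.0)     # fluid sliding along and pressing into the surface
collider = (0.0, 0.5, 0.0)   # collider moving up
full_slip = slippery_collision_vel(fluid, collider, n, scale=1.0)
half_slip = slippery_collision_vel(fluid, collider, n, scale=0.5)
```

Because only the tangential part changes, a solve that enforces non-penetration along the normal sees the same collider motion as before.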
Technical Discussion » Voronoi Fracture Geo Replacement
- johner
- 815 posts
- Offline
As of 12.5, the Transform Pieces SOP will do exactly this transformation.
http://www.sidefx.com/docs/houdini13.0/nodes/sop/xformpieces [sidefx.com]
Technical Discussion » Memory Leak in H13 with Flips maybe ?
- johner
- 815 posts
- Offline
Solitude
It seems to fill up my available ram, but in terms of simming speed it held steady, and the computer was still quite responsive. Still, guess I'll be setting up dual boot this weekend.
Yes, the Windows allocator (tbbmalloc) allocates memory from the OS in pools and manages them itself for speed. When memory gets fragmented, it can't release these pools back to the OS, although it can still use the free memory within them. The Linux allocator (jemalloc) does a much better job of returning that memory to the OS.
Linux has always been the best OS for large scale simulation, but it's even more true at the moment.
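A toy Python model makes the "looks like a leak but isn't" behavior concrete. It is a sketch under simplifying assumptions (fixed-size pools and slots, a hypothetical PoolAllocator class) and says nothing about how tbbmalloc or jemalloc are actually implemented.

```python
class PoolAllocator:
    """Toy allocator that takes fixed-size pools from the OS and hands out slots."""

    def __init__(self, slots_per_pool):
        self.slots_per_pool = slots_per_pool
        self.pools = {}          # pool id -> set of slots currently in use
        self.next_pool_id = 0

    def allocate(self):
        # Reuse free space inside an existing pool before asking the OS for more.
        for pool_id, used in self.pools.items():
            if len(used) < self.slots_per_pool:
                slot = min(set(range(self.slots_per_pool)) - used)
                used.add(slot)
                return pool_id, slot
        pool_id = self.next_pool_id
        self.next_pool_id += 1
        self.pools[pool_id] = {0}
        return pool_id, 0

    def free(self, handle):
        pool_id, slot = handle
        self.pools[pool_id].discard(slot)

    def release_empty_pools(self):
        """Only completely empty pools can go back to the OS. Returns how many did."""
        empty = [pid for pid, used in self.pools.items() if not used]
        for pid in empty:
            del self.pools[pid]
        return len(empty)

    def held_slots(self):
        return len(self.pools) * self.slots_per_pool

    def live_slots(self):
        return sum(len(used) for used in self.pools.values())

alloc = PoolAllocator(slots_per_pool=4)
handles = [alloc.allocate() for _ in range(16)]   # fills four pools
# Free everything except one allocation per pool.
for pool_id, slot in handles:
    if slot != 0:
        alloc.free((pool_id, slot))
released = alloc.release_empty_pools()
```

Only 4 of the 16 slots are still live, yet no pool can be returned because each one holds a single survivor. The free slots remain usable by later allocations, which matches the steady sim speed reported above even as the process footprint stays high.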
tricecold
I suppose this issue applies to other tools in Houdini 13 also, like Pyro, RBD multisolvers etc, on windows, please confirm.
With Reseeding and the boundary layers in the OceanFX tools, the FLIP solver is often creating and deleting large numbers of particles each timestep, which puts a lot of pressure on the memory allocator. So it seems to show the problem the worst by far. Many other tools that don't create and delete memory as frequently won't be affected much (if at all.)
Technical Discussion » Memory Leak in H13 with Flips maybe ?
- johner
- 815 posts
- Offline
With H13 we moved to a new memory allocator under Windows that is much faster than the default one provided by Microsoft. Unfortunately it can fragment memory over time, which looks a lot like a leak, though technically speaking it is not.
Until we can work out a solution, we're recommending running very large simulations under Linux if at all possible.
Technical Discussion » Vector Fields to SOP
- johner
- 815 posts
- Offline
Technical Discussion » pory sim(OpenCL) use single or double precision?
- johner
- 815 posts
- Offline
aty84122
If it uses double precision, is the GTX580 faster than the GTX780?
It uses single precision, except in a few rare (optional) cases that involve calculating the residual error in the solve. So I would not make any evaluation based on double precision performance, at least for Houdini.
Technical Discussion » Intel OpenCL (on Linux) status?
- johner
- 815 posts
- Offline
pbowmar
I notice that HOUDINI_OCL_MEMORY_POOL_SIZE=.5 isn't working for GPU mode, I set it to .5 of my 4gb card, but Houdini reported still only 1gb. Perhaps that's just the diagnostics are wrong?
Just so we're using the same terms, the memory pool is just an acceleration method: allocating memory on the GPU is slow, and we re-use lots of identical-sized memory buffers during the solve, so it makes sense to have a pool of recently-used buffers for quick re-use. The HOUDINI_OCL_MEMORY_POOL_SIZE just sets the amount of total GPU memory available for that pool of buffers. The solver works just fine with it set to zero; in fact you'll have more memory available for the solve itself, but you'll pay a small price in performance.
So normally only a small amount of memory is allocated for the memory pool (12.5 % of total device memory IIRC), but you can increase it if you have lots of memory on the GPU. You usually don't, so I wouldn't necessarily recommend increasing it, since you'll run out of memory for the actual solve sooner. On the other hand, the memory pool turns out to increase performance (more than I would have thought) with the Intel OpenCL on CPU as well, where you might actually have some insanely huge amount of memory, in which case increasing the pool size might make sense.
If you've got HOUDINI_OCL_REPORT_MEMORY_USE set, the Total Memory Allocated is the total amount of device memory allocated. If your GPU is also your display device, you'll never get close to 4GB as there's too much other memory being used for display buffers and textures and such. The In Memory Pool is the amount of memory in the memory pool, while Active Memory is the amount currently taken by the buffers in the current sim. These two numbers should add up to the Total Memory at the end of each timestep.
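The bookkeeping described above can be modeled with a small Python sketch. It is a toy model, not Houdini's OpenCL memory pool: the BufferPool class, its fields and the buffer size are hypothetical, and the 0.125 fraction simply reuses the 12.5% figure recalled above.

```python
class BufferPool:
    """Toy model of a pool of recently used device buffers, keyed by size."""

    def __init__(self, device_bytes, pool_fraction):
        self.pool_limit = int(device_bytes * pool_fraction)
        self.free_buffers = {}       # size -> count of idle buffers kept for reuse
        self.in_pool = 0             # bytes sitting idle in the pool
        self.active = 0              # bytes used by buffers in the current sim
        self.device_allocations = 0  # number of slow device allocations made

    def acquire(self, size):
        if self.free_buffers.get(size, 0) > 0:
            # Fast path: reuse an idle buffer of the same size.
            self.free_buffers[size] -= 1
            self.in_pool -= size
        else:
            self.device_allocations += 1
        self.active += size

    def release(self, size):
        self.active -= size
        if self.in_pool + size <= self.pool_limit:
            self.free_buffers[size] = self.free_buffers.get(size, 0) + 1
            self.in_pool += size
        # Otherwise the buffer goes back to the device instead of being pooled.

    def total_allocated(self):
        return self.in_pool + self.active

def run_timestep(pool, size, passes=3):
    """Several solver passes that each need a temporary buffer, plus one kept at the end."""
    for _ in range(passes):
        pool.acquire(size)
        pool.release(size)
    pool.acquire(size)

field = 128 * 2**20  # a hypothetical 128 MiB buffer
pooled = BufferPool(device_bytes=4 * 2**30, pool_fraction=0.125)
unpooled = BufferPool(device_bytes=4 * 2**30, pool_fraction=0.0)
run_timestep(pooled, field)
run_timestep(unpooled, field)
```

Both runs end the timestep with the same active memory, but the pooled run makes one device allocation where the unpooled run makes four, which is the trade between speed and memory left for the solve.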