I've got this working. I'm on Windows 11 Pro, Houdini 21.0.440, with a 3090 (device 0) and a 4090 (device 1).
Install the Nvidia CUDA Toolkit 12.8 per the Houdini installation docs [www.sidefx.com]. I installed cuDNN 9.15. The cuDNN installer does not add its bin directory to PATH, so it has to be added manually; on my machine that directory is
C:\Program Files\NVIDIA\CUDNN\v9.15\bin\12.9. I did not need to copy over zlib.dll.
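If you want to confirm the PATH change took effect before launching Houdini, something like the sketch below should do it. The helper name `dir_on_path` is made up for illustration, and the directory is just the one from my install, so adjust it to yours. It only checks that the entry is present in PATH; it does not verify that Houdini actually loads the cuDNN DLLs.

```python
import ntpath
import os

# Directory from my install; yours may differ (assumption, adjust as needed).
CUDNN_BIN = r"C:\Program Files\NVIDIA\CUDNN\v9.15\bin\12.9"


def dir_on_path(directory, path_value, sep=";"):
    """Return True if `directory` is an entry in a Windows-style PATH string.

    Comparison ignores case, surrounding quotes, and trailing slashes,
    which is how Windows treats these paths.
    """
    def norm(p):
        return ntpath.normcase(ntpath.normpath(p.strip().strip('"')))

    target = norm(directory)
    return any(norm(entry) == target
               for entry in path_value.split(sep) if entry.strip())


# Check the PATH of the current process (run this from a fresh terminal).
print(dir_on_path(CUDNN_BIN, os.environ.get("PATH", "")))
```

Run it from a newly opened terminal, since already-open shells keep the old PATH.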
Even though the 3090 is the primary GPU, when the device is set to CUDA it runs on the 4090 (conveniently ideal for my use case). Setting CUDA_VISIBLE_DEVICES has no effect. When using DirectML, it runs on the 3090.
Because of this, the following benchmarks aren't apples to apples.
240f Large Neural Surface Test:
- CUDA (4090) - 159.7s
- DirectML (3090) - 272.5s
60f Small Neural Surface Test:
- CUDA (4090) - 11.8s
- DirectML (3090) - 17.9s
- CPU (9950x) - 42.6s
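
For a quick read on the ratios, here is a small sketch that expresses each timing relative to the fastest backend in the same test. The dictionary keys and the function name `relative_to_fastest` are just labels I picked; the numbers are the ones listed above. Keep in mind that CUDA and DirectML ran on different cards, so the ratios mix backend and hardware differences.

```python
# Timings in seconds, copied from the results above.
timings = {
    "large_240f": {"CUDA (4090)": 159.7, "DirectML (3090)": 272.5},
    "small_60f": {
        "CUDA (4090)": 11.8,
        "DirectML (3090)": 17.9,
        "CPU (9950x)": 42.6,
    },
}


def relative_to_fastest(results):
    """Map each backend to its time divided by the fastest time in the test."""
    fastest = min(results.values())
    return {name: secs / fastest for name, secs in results.items()}


for test, results in timings.items():
    for name, ratio in relative_to_fastest(results).items():
        print(f"{test}: {name} took {ratio:.2f}x the fastest time")
```

On these numbers, DirectML on the 3090 takes roughly 1.7x (large test) and 1.5x (small test) as long as CUDA on the 4090, and the CPU takes roughly 3.6x as long on the small test.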