Hi brians,
Thanks for the recommendations.
brians wrote:
"I think the first thing to try is to render with only the EmbreeCPU device active on that machine"
Rendering with the envvar (KARMA_XPU_DISABLE_OPTIX_DEVICE=1) set on the same machine that has the 4090 gave a much faster render time using only the CPU.
No envvar = an average of 6mins 00secs
With envvar = an average of 3mins 40secs
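In case anyone wants to repeat the comparison, here is a minimal Python sketch of the A/B toggle. It only prepares the environment for the render process. `render_cmd` is a placeholder for whatever command line your farm already runs, and the helper names `xpu_env` and `run_render` are made up for illustration. The variable name is the one I used for the "with envvar" runs.

```python
import os
import subprocess

# Variable used for the "with envvar" runs in this post.
OPTIX_OFF = "KARMA_XPU_DISABLE_OPTIX_DEVICE"

def xpu_env(disable_optix, base=None):
    """Return a copy of the environment with the OptiX device toggled."""
    env = dict(os.environ if base is None else base)
    if disable_optix:
        env[OPTIX_OFF] = "1"
    else:
        # Make sure a stale value is not inherited from the parent shell.
        env.pop(OPTIX_OFF, None)
    return env

def run_render(render_cmd, disable_optix):
    """Run an existing render command (a list of arguments) with the chosen environment."""
    return subprocess.run(render_cmd, env=xpu_env(disable_optix), check=True)
```

Removing the variable explicitly in the "no envvar" case avoids a leftover value from the submitting shell skewing the comparison.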
After resetting the job a few times with and without the envvars, I also noted some inconsistent behavior.
No envvar.
frame 101 = 2min 50secs
frame 102 = 4min 55secs
frame 103 = 3min 10secs
frame 104 = 2min 45secs
frame 105 = 12min 30secs
We actually see the first and fourth frames using the GPU more than the CPU, which gives the kind of render time we're expecting. However, subsequent frames take much longer again, most frequently averaging the times mentioned above.
So is it possibly slowing down when the GPU VRAM is overloaded, or is it something else?
A separate set of frames, for example (Karma XPU, same settings, same scene, no envvar; this one had a "catmullClark" subdivision scheme on the geometry, so it is more likely to use extra memory):
frame 396 = 7min 25secs
frame 397 = 6min 25secs
frame 398 = 6min 25secs
frame 399 = 7min 30secs
frame 400 = 7min 30secs
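To put numbers on the spread, here is a small Python sketch that summarizes the two frame lists above. The timings are copied from this post; the helper names are just for illustration.

```python
from statistics import mean, median

def to_seconds(minutes, seconds):
    return minutes * 60 + seconds

def fmt(seconds):
    """Format a duration in seconds the same way as the lists above."""
    m, s = divmod(round(seconds), 60)
    return f"{m}min {s:02d}secs"

# frame -> (minutes, seconds), copied from the two lists above
no_envvar = {101: (2, 50), 102: (4, 55), 103: (3, 10), 104: (2, 45), 105: (12, 30)}
catmull_clark = {396: (7, 25), 397: (6, 25), 398: (6, 25), 399: (7, 30), 400: (7, 30)}

def summarize(frames):
    secs = [to_seconds(m, s) for m, s in frames.values()]
    return {"mean": mean(secs), "median": median(secs), "min": min(secs), "max": max(secs)}

for label, frames in (("no envvar", no_envvar), ("catmullClark", catmull_clark)):
    stats = summarize(frames)
    print(label, {key: fmt(value) for key, value in stats.items()})
```

For frames 101 to 105 the mean is 5min 14secs against a median of 3min 10secs, so the single 12min 30secs frame accounts for most of the gap. The catmullClark set is much tighter, with a mean of 7min 03secs.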
Over on the other thread regarding the excessive memory usage:
https://www.sidefx.com/forum/topic/95306/?page=1#post-419135
I removed all the subdivisions from the scene, which reduced memory usage (still not as low as it should be), but the 24GB of VRAM on the GPU is still maxing out.
No "cudaErrorMemoryAllocation" reported in the log.
I also increased the verbosity, but I'm not seeing much that's useful in the logs:
Karma XPU, no envvar:
R622| [14:45:39] Unified Cache: 363.29 MiB of 31.97 GiB used
R623| [14:45:39] In Cache Faults Generated
R624| [14:45:39] 363.29 MiB 48,825 363.29 MiB 7.62 KiB
R625| [14:45:39] TBF Cache: 310 file opens (133.66 MiB read)
R626| [14:45:39] RAT Disk Cache: 60 hits
R627| [14:45:39] accept_unmipped : 1
R628| [14:45:39] accept_untiled : 1
R629| [14:45:39] automip : 0
R630| [14:45:39] autoscanline : 1
R631| [14:45:39] autotile : 512
R632| [14:45:39] deduplicate : 1
R633| [14:45:39] failure_retries : 0
R634| [14:45:39] forcefloat : 0
R635| [14:45:39] max_errors_per_file : 100
R636| [14:45:39] max_memory_MB : 63.94 GiB
R637| [14:45:39] max_mip_res : 1,073,741,824
R638| [14:45:39] max_open_files : 1024
R639| [14:45:39] searchpath : ''
R640| [14:45:39] trust_file_extensions : 0
R641| [14:45:39] unassociatedalpha : 0
R658| OpenImageIO ImageCache statistics (shared) ver 2.3.14
R659| Options: max_memory_MB=65473.0 max_open_files=1024 autotile=512
R660| autoscanline=1 automip=0 forcefloat=0 accept_untiled=1
R661| accept_unmipped=1 deduplicate=1 unassociatedalpha=0
R662| failure_retries=0
R663| Images : 63 unique
R664| ImageInputs : 62 created, 2 current, 12 peak
R665| Total pixel data size of all images referenced : 2.6 GB
R666| Total actual file size of all images referenced : 979.9 MB
R667| Pixel data read : 239.7 MB
R668| File I/O time : 14.1s (0.2s average per thread, for 81 threads)
R669| File open time only : 0.2s
R670| Tiles: 475 created, 472 current, 472 peak
R671| total tile requests : 157261461
R672| micro-cache misses : 92888 (0.059066%)
R673| main cache misses : 475 (0.000302045%)
R674| redundant reads: 0 tiles, 0 B
R675| Peak cache memory : 239.7 MB
R761| [14:45:39] Object Counts:
R762| [14:45:39] Cameras: 1
R763| [14:45:39] Coordinate Spaces: 0
R764| [14:45:39] Curve Meshes: 0
R765| [14:45:39] Light Tree: 0
R766| [14:45:39] Lights: 2
R767| [14:45:39] Point Meshes: 0
R768| [14:45:39] Polygon Meshes: 285,891 total 1,627 unique
R769| [14:45:39] Volumes: 0
R770| [14:45:39] Geometry Counts:
R771| [14:45:39] Curves: 0
R772| [14:45:39] Points: 0
R773| [14:45:39] Polygons: 60,271,321 total 14,230,441 unique
R774| [14:45:39] Polygons (Diced): 2,070,273
R775| [14:45:39] Light Types:
R776| [14:45:39] Cylinder: 0
R777| [14:45:39] Disk: 0
R778| [14:45:39] Distant: 1
R779| [14:45:39] Dome: 1
R780| [14:45:39] Geometry: 0
R781| [14:45:39] Line: 0
R782| [14:45:39] Point: 0
R783| [14:45:39] Rectangle: 0
R784| [14:45:39] Sphere: 0
R785| [14:45:39] Shader Nodes:
R786| [14:45:39] CPU Shaders: 107 total 15 unique
R787| [14:45:39] Function Errors: 0
R788| [14:45:39] Functions Loaded: 93
R789| [14:45:39] Largest Shader: 64
R790| [14:45:39] Shader Nodes: 5,502
R791| [14:45:39] Shaders: 396
R792| [14:45:39] USD Preview Shaders: 2
R793| [14:45:39] Ray Counts:
R794| [14:45:39] Camera Rays: 265,420,800
R795| [14:45:39] Indirect: 283,525,303
R796| [14:45:39] Light Geometry: 0
R797| [14:45:39] Occlusion: 1,009,531,091
R798| [14:45:39] Probe: 5,733,485
R799| [14:45:39] Total: 1,564,210,679
R800| [14:45:39] Shader Calls:
R801| [14:45:39] Displacement: 188,534,307
R802| [14:45:39] Emission: 0
R803| [14:45:39] Light: 0
R804| [14:45:39] Opacity: 0
R805| [14:45:39] Surface: 0
R806| [14:45:39] Volume: 0
R807| [14:45:39] Primvar Cache: 35,386 hits, 12,414 misses
R808| [14:45:39] Primvar Memory Usage Actual Uncompressed
R809| [14:45:39] real32[3] <dicedmesh> 9.42 GiB 9.43 GiB
R810| [14:45:39] int32 <topology> 1.99 GiB 2.25 GiB
R811| [14:45:39] real32[3] N 63.30 MiB 598.36 MiB
R812| [14:45:39] real32[3] Pref 18.25 MiB 166.57 MiB
R813| [14:45:39] real32[3] P 18.25 MiB 166.44 MiB
R814| [14:45:39] int32 QuadVerts 8.23 MiB 115.07 MiB
R815| [14:45:39] int32 TriVerts 823.73 KiB 11.25 MiB
R816| [14:45:39] real32[2] st 594.35 KiB 3.34 MiB
R858| [14:45:39] Bucket Time Breakdown:
R859| [14:45:39] Category Time Percentage
R860| [14:45:39] Dicing 0:00:14.62 100.00
R861| [14:45:39] Filtering 0:00:00 0.00
R862| [14:45:39] Indirect rays 0:00:00 0.00
R863| [14:45:39] Lighting 0:00:00 0.00
R864| [14:45:39] Primary rays 0:00:00 0.00
R865| [14:45:39] SSS samples 0:00:00 0.00
R866| [14:45:39] Shading 0:00:00 0.00
R867| [14:45:39] Shadows 0:00:00 0.00
R868| [14:45:39] Unaccounted 0:00:00 0.00
R869| [14:45:39] Total Wall Clock Time: 0:09:29.68
R870| [14:45:39] Total CPU Time: 0:00:07.55
R871| [14:45:39] System CPU Time Only: 0:00:05.58
R872| [14:45:39] Current Memory Usage: 87.45 GiB
R873| [14:45:39] Peak Memory Usage: 87.45 GiB
R874| [14:45:40] Image save time: 0:00:00.47
Karma XPU, with envvar (KARMA_XPU_DISABLE_OPTIX_DEVICE=1):
R562| [15:00:14] Unified Cache: 355.30 MiB of 31.97 GiB used
R563| [15:00:14] In Cache Faults Generated
R564| [15:00:14] 355.30 MiB 47,752 355.30 MiB 7.62 KiB
R565| [15:00:14] TBF Cache: 441 file opens (127.25 MiB read)
R566| [15:00:14] RAT Disk Cache: 56 hits
R567| [15:00:14] accept_unmipped : 1
R568| [15:00:14] accept_untiled : 1
R569| [15:00:14] automip : 0
R570| [15:00:14] autoscanline : 1
R571| [15:00:14] autotile : 512
R572| [15:00:14] deduplicate : 1
R573| [15:00:14] failure_retries : 0
R574| [15:00:14] forcefloat : 0
R575| [15:00:14] max_errors_per_file : 100
R576| [15:00:14] max_memory_MB : 63.94 GiB
R577| [15:00:14] max_mip_res : 1,073,741,824
R578| [15:00:14] max_open_files : 4096
R579| [15:00:14] searchpath : ''
R580| [15:00:14] trust_file_extensions : 0
R581| [15:00:14] unassociatedalpha : 0
R582| [15:00:14] OpenImageIO Stats String:
R583| [15:00:14] OpenImageIO Texture statistics
R584| Options: gray_to_rgb=0 flip_t=0 max_tile_channels=6
R585| Queries/batches :
R586| texture : 156777427 queries in 156777427 batches
R587| texture 3d : 0 queries in 0 batches
R588| shadow : 0 queries in 0 batches
R589| environment : 0 queries in 0 batches
R590| gettextureinfo : 0 queries
R591| Interpolations :
R592| closest : 0
R593| bilinear : 156777427
R594| bicubic : 0
R595| Average anisotropic probes : 1
R596| Max anisotropy in the wild : 1
R597|
R598| OpenImageIO ImageCache statistics (shared) ver 2.3.14
R599| Options: max_memory_MB=65473.0 max_open_files=4096 autotile=512
R600| autoscanline=1 automip=0 forcefloat=0 accept_untiled=1
R601| accept_unmipped=1 deduplicate=1 unassociatedalpha=0
R602| failure_retries=0
R603| Images : 63 unique
R604| ImageInputs : 62 created, 6 current, 11 peak
R605| Total pixel data size of all images referenced : 2.6 GB
R606| Total actual file size of all images referenced : 979.9 MB
R607| Pixel data read : 239.7 MB
R608| File I/O time : 19.8s (0.2s average per thread, for 99 threads)
R609| File open time only : 0.2s
R610| Tiles: 476 created, 472 current, 472 peak
R611| total tile requests : 157261461
R612| micro-cache misses : 87199 (0.0554484%)
R613| main cache misses : 476 (0.000302681%)
R614| redundant reads: 0 tiles, 0 B
R615| Peak cache memory : 239.7 MB
R701| [15:00:14] Object Counts:
R702| [15:00:14] Cameras: 1
R703| [15:00:14] Coordinate Spaces: 0
R704| [15:00:14] Curve Meshes: 0
R705| [15:00:14] Light Tree: 0
R706| [15:00:14] Lights: 2
R707| [15:00:14] Point Meshes: 0
R708| [15:00:14] Polygon Meshes: 285,891 total 1,627 unique
R709| [15:00:14] Volumes: 0
R710| [15:00:14] Geometry Counts:
R711| [15:00:14] Curves: 0
R712| [15:00:14] Points: 0
R713| [15:00:14] Polygons: 60,271,321 total 14,230,441 unique
R714| [15:00:14] Polygons (Diced): 1,657,614
R715| [15:00:14] Light Types:
R716| [15:00:14] Cylinder: 0
R717| [15:00:14] Disk: 0
R718| [15:00:14] Distant: 1
R719| [15:00:14] Dome: 1
R720| [15:00:14] Geometry: 0
R721| [15:00:14] Line: 0
R722| [15:00:14] Point: 0
R723| [15:00:14] Rectangle: 0
R724| [15:00:14] Sphere: 0
R725| [15:00:14] Shader Nodes:
R726| [15:00:14] CPU Shaders: 107 total 15 unique
R727| [15:00:14] Function Errors: 0
R728| [15:00:14] Functions Loaded: 93
R729| [15:00:14] Largest Shader: 64
R730| [15:00:14] Shader Nodes: 5,502
R731| [15:00:14] Shaders: 396
R732| [15:00:14] USD Preview Shaders: 2
R733| [15:00:14] Ray Counts:
R734| [15:00:14] Camera Rays: 265,420,800
R735| [15:00:14] Indirect: 283,525,303
R736| [15:00:14] Light Geometry: 0
R737| [15:00:14] Occlusion: 1,009,531,114
R738| [15:00:14] Probe: 5,733,485
R739| [15:00:14] Total: 1,564,210,702
R740| [15:00:14] Shader Calls:
R741| [15:00:14] Displacement: 188,534,307
R742| [15:00:14] Emission: 0
R743| [15:00:14] Light: 0
R744| [15:00:14] Opacity: 0
R745| [15:00:14] Surface: 0
R746| [15:00:14] Volume: 0
R747| [15:00:14] Primvar Cache: 35,386 hits, 12,414 misses
R748| [15:00:14] Primvar Memory Usage Actual Uncompressed
R749| [15:00:14] real32[3] <dicedmesh> 9.42 GiB 9.43 GiB
R750| [15:00:14] int32 <topology> 1.99 GiB 2.25 GiB
R751| [15:00:14] real32[3] N 63.30 MiB 598.36 MiB
R752| [15:00:14] real32[3] Pref 18.25 MiB 166.57 MiB
R753| [15:00:14] real32[3] P 18.25 MiB 166.44 MiB
R754| [15:00:14] int32 QuadVerts 8.23 MiB 115.07 MiB
R755| [15:00:14] int32 TriVerts 823.73 KiB 11.25 MiB
R756| [15:00:14] real32[2] st 594.35 KiB 3.34 MiB
R798| [15:00:14] Bucket Time Breakdown:
R799| [15:00:14] Category Time Percentage
R800| [15:00:14] Dicing 0:00:14.58 100.00
R801| [15:00:14] Filtering 0:00:00 0.00
R802| [15:00:14] Indirect rays 0:00:00 0.00
R803| [15:00:14] Lighting 0:00:00 0.00
R804| [15:00:14] Primary rays 0:00:00 0.00
R805| [15:00:14] SSS samples 0:00:00 0.00
R806| [15:00:14] Shading 0:00:00 0.00
R807| [15:00:14] Shadows 0:00:00 0.00
R808| [15:00:14] Unaccounted 0:00:00 0.00
R809| [15:00:14] Total Wall Clock Time: 0:03:47.37
R810| [15:00:14] Total CPU Time: 0:00:08.39
R811| [15:00:14] System CPU Time Only: 0:00:02.08
R812| [15:00:14] Current Memory Usage: 61.54 GiB
R813| [15:00:14] Peak Memory Usage: 61.54 GiB
R814| [15:00:15] Image save time: 0:00:00.47
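Since most of the two dumps are identical, the lines that differ are easier to compare side by side. A rough Python sketch that pulls the wall clock time and peak memory out of a stats dump like the ones above follows. The regular expressions assume the exact line wording shown in these logs, and `summarize_log` is a made-up helper name.

```python
import re

# Patterns assume the line wording shown in the logs in this post.
WALL = re.compile(r"Total Wall Clock Time:\s*(\d+):(\d+):(\d+(?:\.\d+)?)")
PEAK = re.compile(r"Peak Memory Usage:\s*([\d.]+)\s*GiB")

def summarize_log(text):
    """Extract wall clock seconds and peak memory (GiB) from a Karma stats dump."""
    wall_match = WALL.search(text)
    peak_match = PEAK.search(text)
    wall = None
    if wall_match:
        hours, minutes, seconds = wall_match.groups()
        wall = int(hours) * 3600 + int(minutes) * 60 + float(seconds)
    peak = float(peak_match.group(1)) if peak_match else None
    return {"wall_s": wall, "peak_gib": peak}

# The relevant lines from the two dumps in this post.
no_envvar = summarize_log("""
R869| [14:45:39] Total Wall Clock Time: 0:09:29.68
R873| [14:45:39] Peak Memory Usage: 87.45 GiB
""")
with_envvar = summarize_log("""
R809| [15:00:14] Total Wall Clock Time: 0:03:47.37
R813| [15:00:14] Peak Memory Usage: 61.54 GiB
""")

print(no_envvar, with_envvar)
print("slowdown without envvar:", round(no_envvar["wall_s"] / with_envvar["wall_s"], 2))
print("extra peak memory (GiB):", round(no_envvar["peak_gib"] - with_envvar["peak_gib"], 2))
```

For these two frames the run without the envvar takes about 2.5 times as long and peaks roughly 26 GiB higher in memory, while the ray counts, shader counts and primvar memory are essentially the same in both dumps.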
thanks,
amwilkins