Hello,
One of our FX artists has been getting "weird" output in the mantra logs for his jobs, and I can't quite pinpoint what's causing it. He's the only one with this problem and we're not sure why. It also seems to correlate with long render times.
mantra Version 14.0.201.13
IFDs are first exported and then rendered in command line mantra.
Usually we get the progress lines (e.g. ALF_PROGRESS 12%) and then render stats at the end. But right now we get the following line several times (500+) between each progress update:
Sample data memory limited from 246.022064 MB to 232.739975 MB (limit is 256.000000 MB)
My guess is that some kind of memory restriction has been set up somewhere in his scene and the message is printed for every sample, but I'm not sure where to start. Is it a mantra setting or a more general Houdini restriction? It would be much appreciated if someone could point me in the right direction.
Thanks!
Mantra log question - Sample data memory limit
- cguer
- neil_math_comp
What's the verbosity set to? That message should only appear at verbosity level 4 or higher.
To support arbitrary pixel filters (in H13, only linear filters were supported), sample data from ray tracing needs to be cached until all of the output rendering tiles that depend on it have been filtered and output, and a tile can only be processed once all of the sample data it depends on is ready. The amount of data that ends up being cached at any point in time depends on a few things:
* Pixel samples
* Image resolution (usually just the horizontal resolution if it's not rendering to mplay or Render View)
* Filter width of the pixel filter
* Render tile size (default is 16x16)
* Number of image planes & the number of floats in each
* Proportion of rendering tiles that are entirely a constant value in any image plane, since constant tiles are stored as a single entry, instead of all samples
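As a rough illustration of how those factors combine, here's a hypothetical back-of-the-envelope estimator in Python (`sample_cache_mb` is my own name, not a mantra setting; it assumes roughly one tile-row of samples is held across the image width, and ignores the savings from constant tiles and narrow filters):

```python
def sample_cache_mb(width_px, tile_height, samples_per_pixel,
                    floats_per_sample, bytes_per_float=4):
    """Rough upper bound on cached sample data, in MB: one tile-row of
    samples across the image width, times the floats stored per sample
    across all image planes."""
    total_bytes = (width_px * tile_height * samples_per_pixel
                   * floats_per_sample * bytes_per_float)
    return total_bytes / (1024.0 ** 2)

# e.g. a 1920-wide render with default 16-pixel tiles, 3x3 pixel samples,
# and ~14 floats per sample across all planes:
print(sample_cache_mb(1920, 16, 3 * 3, 14))  # ~14.8 MB, far below 256 MB
```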
I found that pretty high numbers were required to hit 256MB. We tried a 4K+ render a customer had sent with about 150 image planes, and it didn't hit 200MB. The cache limiting checks if the cache is above 95%, and if so, will lower it to 90% by writing data out to a file, so it should only be hit every ~12.8MB of sample data. I added a not-yet-exposed option to change the cache limit, in case people needed it higher due to a file I/O bottleneck.
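The limiting behaviour described above can be sketched like this (a hypothetical illustration, not mantra's actual code; `maybe_spill` is an invented name):

```python
HIGH_WATER = 0.95  # spill once the cache crosses 95% of the limit
LOW_WATER = 0.90   # ...and write data out until it is back near 90%

def maybe_spill(cache_mb, limit_mb=256.0):
    """If the sample cache exceeds 95% of the limit, spill data to disk
    down to 90% and log the reduction; otherwise leave it alone.  The
    5% gap (12.8 MB of a 256 MB limit) is why the message should only
    repeat every ~12.8 MB of new sample data."""
    if cache_mb > HIGH_WATER * limit_mb:
        target = LOW_WATER * limit_mb
        print("Sample data memory limited from %f MB to %f MB (limit is %f MB)"
              % (cache_mb, target, limit_mb))
        return target
    return cache_mb
```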
Should the message be at a different verbosity level? What are the numbers for the items listed above? Is the file I/O thrashing the network/disk?
Writing code for fun and profit since... 2005? Wow, I'm getting old.
https://www.youtube.com/channel/UC_HFmdvpe9U2G3OMNViKMEQ
- cguer
Thanks for your response ndickson. Our verbosity level for all mantra renders is 4, so it makes sense that we see these warnings. It's just odd that we've never encountered them before.
We are rendering in the cloud (Google Compute) as well as locally, and these renders take significantly longer in the cloud. Our I/O is slower in the cloud, but we've never had that big of a render time difference before. Your explanation about writing data out to a file when the cache goes over the limit could explain most of the problem, though: we're using relatively small boot disks for our cloud machines (10 GB) and a separate storage pool for the rendered files, so I suppose they could fill up relatively quickly with these files and run out of space.
I need to check back with the team to make sure I have all the other info right and show them your explanation but in the meantime here is the pre-render log for one of the problematic jobs if it can give you some info:
Plane: Cf+Af (16-bit float)
SampleFilter: alpha
PixelFilter: gaussian -w 2
VEX Type: vector4
Gamma: 1
Dither: 0.5
Gain: 1
White point: 1
Plane: P (16-bit float)
SampleFilter: alpha
PixelFilter: minmax edge
VEX Type: vector
Gamma: 1
Dither: 0.5
Gain: 1
White point: 1
Plane: Pz (16-bit float)
SampleFilter: alpha
PixelFilter: minmax edge
VEX Type: float
Gamma: 1
Dither: 0.5
Gain: 1
White point: 1
Plane: direct_diffuse (16-bit float)
SampleFilter: alpha
PixelFilter: gaussian -w 2
VEX Type: vector
Gamma: 1
Dither: 0.5
Gain: 1
White point: 1
Plane: v (16-bit float)
SampleFilter: alpha
PixelFilter: gaussian -w 2
VEX Type: vector
Gamma: 1
Dither: 0.5
Gain: 1
White point: 1
Waited 0.913369s. for background load tasks
Load Time: 4.63u 2.06s 17.61r
Memory: 1.18 GB
page rclm : 63622 flts: 0
# swaps : 0
blocks in : 1743520 out: 0
switch ctx: 16968 ictx: 262
Thanks again!
- neil_math_comp
It'd be good if you could submit a bug report with an example so that we can get an idea of what's going on, or you could post something here if you'd like community feedback. We may have to expose the sample cache limit option, but it may just be something unforeseen or something that can be optimized better.
It looks like none of the pixel filters are super wide, and there are only a total of 14 floats per sample from the image planes, so it's not too bad on those fronts. There must be something else going on. My wild guess is that Pixel Samples (i.e. the primary rays per pixel, defaulting to 3x3) may be very high, in which case, it can usually be lowered, (unless the noise is dominated by motion blur noise), by increasing the Ray Samples (secondary rays per hit or rays per light per hit).
- cguer
So I relayed the information to our artist, and they were able to adjust their render settings so that they no longer hit the cache limit; everything seems to be working fine now.
We'll be checking with them in the next few days to see if they would potentially need to modify the cache limit but right now it seems they were able to adjust their settings properly.
Thanks again for your help and we'll send an e-mail request if the need to bump the cache limit occurs.
- EwanW
I am getting a lot of these errors when rendering my scene at 960x408 and 10 pixel samples. We have 20 AOVs but this doesn't seem excessive compared to the example you quoted?
Houdini 14.0.249
I'll try and get a scene together to send as bug report. Is there any way of increasing the memory limit in mantra options or is this an actual physical thing on the processor?
- neil_math_comp
EwanW: "I am getting a lot of these errors when rendering my scene at 960x408 and 10 pixel samples. We have 20 AOVs but this doesn't seem excessive compared to the example you quoted"

I suppose if all of the image planes are 4 floats, none have constant tile values, and everything else is default, that'd max out at about:
(960 pixels across) * (16 pixels vertically in a tile) * (10*10 samples per pixel) * (20 image planes) * (4 floats per plane sample) * (4 bytes per float internally)
= 468.75 MB,
so it could happen. Having 10x10 samples per pixel has a big impact on it, so if you can get away with 5x5 by increasing the direct/indirect secondary rays, that'd be 4x less memory for the sample cache. You may also be able to get it to render faster by tweaking those settings. Alternatively, you could change the render tile size to 8x8, and that should cut the sample cache memory use about in half, without affecting the final results at all.
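Plugging those numbers into that formula in Python (a hypothetical helper, `est_mb` is my own name, not a mantra setting) shows the effect of each suggestion:

```python
def est_mb(width_px, tile_height, samples_per_pixel, planes,
           floats_per_plane, bytes_per_float=4):
    # One tile-row of samples across the image, all planes, 4-byte floats.
    return (width_px * tile_height * samples_per_pixel * planes
            * floats_per_plane * bytes_per_float) / (1024.0 ** 2)

print(est_mb(960, 16, 10 * 10, 20, 4))  # 468.75 -- well over the 256 MB limit
print(est_mb(960, 16, 5 * 5, 20, 4))    # 117.1875 -- 4x less with 5x5 samples
print(est_mb(960, 8, 10 * 10, 20, 4))   # 234.375 -- halved with 8x8 tiles
```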
EwanW: "I'll try and get a scene together to send as bug report."

Thanks! That'd be useful.
EwanW: "Is there any way of increasing the memory limit in mantra options…?"

I don't think the option is exposed in any way (it'd be vm_samplecachesize, defaulting to 256, if it were), but I put code in place so that it'd be relatively easy to add the option if it became an issue.

Are the warnings bogging down your renders too, or are they just a nuisance?
- neil_math_comp
Tomorrow's build of Houdini 14.0 will have an option to set the Sample Data Cache Size (vm_samplecachesize) on the Mantra ROP, and the default is increased to 512MB (from 256MB). You'll have to add it from the parameter interface dialog box if you want to increase it further.
Sorry for not adding this sooner. I'd added it to the development build back in April, but hadn't backported it until someone filed a bug for it today.