Well, I am kind of confused about rendering with Mantra. I have a simple scene, a sphere with a spot light and no shadows, which takes around 30 seconds to render, plus around 10 seconds for the window to load. I am used to mental ray or the simple Maya software renderer, where a simple scene like this takes a second to render.
So can someone explain to me why it is so slow?
Mantra slow
- xcomb
- Member
- 74 posts
- Joined: March 2006
- Offline
- goldfarb
- Staff
- 3463 posts
- Joined: July 2005
- Offline
- jason_iversen
- Member
- 12652 posts
- Joined: July 2005
- Offline
Are you rendering with PBR? If you are, a comforting fact is that it doesn't really slow down with quite heavy geometry. In a sense, for PBR, a simple scene is the worst case speed-wise, at least in my experience. Try your full character/scene instead and you might be more pleased with the performance.
If you're not rendering with PBR, what are you doing/how are you rendering?
As for MPlay taking 10 secs to pop up; I don't get that at all (on SUSE or FC4 Linux). It takes about 4-5 secs the first time, but once it's cached, it only takes a second. What is your configuration/setup?
Final question: what version are you using? If you're using an older version on Windows, there was an optimization for rendering to MPlay (sending multiple tiles to MPlay at once, to counter Windows' crap network transfer speeds). Also, run another quick test to see how fast it renders directly to a file on disk rather than to MPlay.
Hope this helps,
Jason
Jason Iversen, Technology Supervisor & FX Pipeline/R+D Lead @ Weta FX
also, http://www.odforce.net [www.odforce.net]
- xcomb
- Member
- 74 posts
- Joined: March 2006
- Offline
Yes, I've thought about using more complex scenes instead of a simple sphere to see the result.
Here is my spec, which I don't think is the issue.
WinXP 64 bit
Intel(R) Core(TM)2 CPU 6700 @ 2.66GHz 3.50 GHz, 4 GB Ram
NVIDIA Quadro FX 1500
I am using a newer version update.
I am using PBR, and I've set the sampling and all the other settings to very low. Still the same.
- jason_iversen
- Member
- 12652 posts
- Joined: July 2005
- Offline
Everything sounds like a go for decent render speed on that machine to me. So yeah, try rendering a more complex scene, I'd say.
Jason Iversen, Technology Supervisor & FX Pipeline/R+D Lead @ Weta FX
also, http://www.odforce.net [www.odforce.net]
- JasonSlab
- Member
- 1535 posts
- Joined: March 2020
- Offline
hey
I've also found Mantra to be slow on simple scenes, but yeah, it doesn't take much longer on a slightly heavier scene.
One of the big issues I'm having is multithreading. My new work machine is a dual quad core, and in the tests I've done on simple and heavy scenes, Mantra slows down to a crawl if I use more than three cores.
Rather odd. Check out the rendering times on a medium-size scene below:
all threads 00:02:38 :shock:
one thread 00:01:04
two threads 00:00:42
three threads 00:00:37
four threads 00:00:48
This was with micropolygon. I've tried raytracing too, and it pretty much comes out with the same result.
running h 9.0.762 on windows 64
i also tested it on 9.0.759
jason
- xcomb
- Member
- 74 posts
- Joined: March 2006
- Offline
- jason_iversen
- Member
- 12652 posts
- Joined: July 2005
- Offline
I'm not terribly surprised you might get such inverse results - and I'm not saying it can't be improved (I don't know) - because Mantra has to do some synchronization and bookkeeping for each thread, and when you have a render as fast as 40 seconds, the overhead of managing the threads plays a larger role than the benefit of the lightning-fast buckets they're rendering.
I would think that the overhead per thread might be a relatively fixed cost per tile. So what you can do yourself is probably increase your bucket size to something very big - like 64 or 128 or more - and let each thread have more to chew on?
Perhaps there are some optimizations that SESI can consider for very light scenes, such as avoiding threading altogether if, say, there is a lot of empty space, or handling multiple buckets per thread or an adaptive bucket size - but they probably do all this anyway.
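To put rough numbers on the bucket-size suggestion, here is a minimal Python sketch that counts how many buckets cover a frame at different bucket sizes. The 640x480 resolution is an assumption (the thread does not state one), `bucket_count` is a hypothetical helper rather than anything in Mantra, and the idea that the overhead is a fixed cost per bucket is only the guess made above.

```python
import math

def bucket_count(width, height, bucket_size):
    """Number of square buckets needed to cover a width x height image."""
    across = math.ceil(width / bucket_size)
    down = math.ceil(height / bucket_size)
    return across * down

# Assumed 640x480 frame; the thread does not state the resolution.
for size in (16, 32, 64, 128):
    print(f"bucket size {size}: {bucket_count(640, 480, size)} buckets")
```

If the per-bucket cost really is roughly fixed, going from 16 to 64 cuts the number of times that cost is paid by about a factor of 15 on this assumed frame.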
Jason Iversen, Technology Supervisor & FX Pipeline/R+D Lead @ Weta FX
also, http://www.odforce.net [www.odforce.net]
- mark
- Staff
- 2641 posts
- Joined: July 2005
- Offline
jason|slab
hey
I've also found Mantra to be slow on simple scenes, but yeah, it doesn't take much longer on a slightly heavier scene.
One of the big issues I'm having is multithreading. My new work machine is a dual quad core, and in the tests I've done on simple and heavy scenes, Mantra slows down to a crawl if I use more than three cores.
Rather odd. Check out the rendering times on a medium-size scene below:
all threads 00:02:38 :shock:
one thread 00:01:04
two threads 00:00:42
three threads 00:00:37
four threads 00:00:48
This was with micropolygon. I've tried raytracing too, and it pretty much comes out with the same result.
running h 9.0.762 on windows 64
i also tested it on 9.0.759
jason
A couple of quick questions:
1) Are you measuring CPU time or wall-clock time?
2) Are you rendering to mplay or a disk file?
With quick renders, the bottleneck can often be I/O. I/O is performed in sequence, meaning only one thread can write to the output at a time. Other threads will spin while waiting for the I/O thread, which might account for increased CPU time but shouldn't increase wall-clock time.
However, your renders are around 40 seconds, so there shouldn't be that much I/O blocking, unless there's an issue with sending the data to MPlay.
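For anyone unsure about the difference in question 1, here is a minimal Python sketch (nothing Mantra-specific) that measures both for the same call. `busy_work` and `sleepy_work` are hypothetical stand-ins: one burns CPU like a render, the other just waits like a thread blocked on I/O.

```python
import time

def busy_work(n):
    """Hypothetical stand-in for a render: burn CPU in a loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total

def sleepy_work(seconds):
    """Stand-in for waiting on I/O: time passes but almost no CPU is used."""
    time.sleep(seconds)
    return seconds

def measure(func, *args):
    """Return (result, wall_seconds, cpu_seconds) for one call."""
    wall_start = time.perf_counter()   # wall-clock timer
    cpu_start = time.process_time()    # CPU time used by this process
    result = func(*args)
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return result, wall, cpu

for func, arg in ((busy_work, 200_000), (sleepy_work, 0.2)):
    _, wall, cpu = measure(func, arg)
    print(f"{func.__name__}: wall {wall:.3f}s, cpu {cpu:.3f}s")
```

Note that `time.process_time` sums CPU time across all threads of the process, so threads that spin while waiting push CPU time up even when wall-clock time stays the same.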
When there are threading bottlenecks, CPU architecture can make a huge difference. Here are some wall-clock timings on a very simple scene I tested on three different machines.
Bucket Size 64:
Core2Duo 1 Proc: 12.7s
Core2Duo 2 Proc: 11.2s
4x Xeon 1 Proc: 16s
4x Xeon 2 Proc: 13.85s
4x Xeon 4 Proc: 17.4s
2x Core2Duo 1 Proc: 8.12s
2x Core2Duo 2 Proc: 4.86s
2x Core2Duo 4 Proc: 3.95s
So, the Xeon had more issues with I/O than the Dual Core2Duo
Bucket Size 16:
Core2Duo 1 Proc: 6.45s
Core2Duo 2 Proc: 3.87s
4x Xeon 1 Proc: 11.75s
4x Xeon 2 Proc: 10.57s
4x Xeon 4 Proc: 14.49s
2x Core2Duo 1 Proc: 5.56s
2x Core2Duo 2 Proc: 3.3s
2x Core2Duo 4 Proc: 2.2s
In this case, decreasing the bucket size resulted in less contention for I/O. Render times dropped substantially on the Core2Duos (roughly halving on the single Core2Duo) and were significantly shorter on the Xeons.
But, really, you want to be testing heavier scenes. Scenes where mantra does a lot of processing per tile, otherwise you're testing I/O performance…
Edit: Note, I just realized that the Xeon is hyperthreaded, which accounts for it getting slower with 4 threads.
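As a rough way to read the timings above, here is a small Python sketch that turns wall-clock times into speedup and per-thread efficiency relative to the 1-proc run. `speedup_table` is a hypothetical helper. The numbers are the bucket size 16 timings from this post.

```python
def speedup_table(timings):
    """Map thread count -> (speedup, efficiency) relative to the 1-thread time."""
    base = timings[1]
    return {n: (base / t, base / t / n) for n, t in sorted(timings.items())}

# Wall-clock seconds from the bucket size 16 runs in this post.
dual_core2duo = {1: 5.56, 2: 3.3, 4: 2.2}
xeon = {1: 11.75, 2: 10.57, 4: 14.49}

for name, timings in (("2x Core2Duo", dual_core2duo), ("4x Xeon", xeon)):
    for n, (speedup, efficiency) in speedup_table(timings).items():
        print(f"{name} {n} proc: speedup {speedup:.2f}x, efficiency {efficiency:.0%}")
```

On these numbers the dual Core2Duo reaches about 2.5x with 4 procs, while the Xeon drops below 1x at 4 procs, which fits the hyperthreading note.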
- JasonSlab
- Member
- 1535 posts
- Joined: March 2020
- Offline
- peliosis
- Member
- 175 posts
- Joined: July 2005
- Offline
Hey guys,
I was a bit off Houdini last year.
But I'm constantly following every small step of SESI, haha.
Out of curiosity I installed H9 and loaded my current project geo, 600k polys in one object. A settlement.
It turns out that, e.g., in ambient occlusion Mantra is faster than my basic Vray setup!
What da hell? This scene is still very basic, but hey, last time it was lagging far behind :twisted:
Does anybody have the same experience, perhaps with other features like full-blown GI? Glass rendering or something?
I'll do more extensive tests, but not in the next 5 mins
:roll:
- digitallysane
- Member
- 1192 posts
- Joined: July 2005
- Offline
peliosis
Does anybody have the same experience, perhaps with other features like full-blown GI? Glass rendering or something?
I'll do more extensive tests, but not in the next 5 mins :roll:
I don't have any experience with Vray, but I can say that I had to use Mantra 9 in production with GI (including rendering 6K images with PBR for a poster) and it is very usable.
Dragos
- digitallysane
- Member
- 1192 posts
- Joined: July 2005
- Offline
mark
Bucket Size 64:
Core2Duo 1 Proc: 12.7s
Core2Duo 2 Proc: 11.2s
4x Xeon 1 Proc: 16s
4x Xeon 2 Proc: 13.85s
4x Xeon 4 Proc: 17.4s
2x Core2Duo 1 Proc: 8.12s
2x Core2Duo 2 Proc: 4.86s
2x Core2Duo 4 Proc: 3.95s
Can you please elaborate a little on those times? What kind of Xeons are those? The “new” Core2Duo versions or the older ones?
If I compare two workstations here (one with a Core2Duo and another with 2 x Xeon Core2Duo 5130), the Xeon system is significantly faster, like 2x, which is what I'd actually expect.
Dragos
- mark
- Staff
- 2641 posts
- Joined: July 2005
- Offline
digitallysane
mark
Bucket Size 64:
Core2Duo 1 Proc: 12.7s
Core2Duo 2 Proc: 11.2s
4x Xeon 1 Proc: 16s
4x Xeon 2 Proc: 13.85s
4x Xeon 4 Proc: 17.4s
2x Core2Duo 1 Proc: 8.12s
2x Core2Duo 2 Proc: 4.86s
2x Core2Duo 4 Proc: 3.95s
Can you please elaborate a little on those times? What kind of Xeons are those? The “new” Core2Duo versions or the older ones?
If I compare two workstations here (one with a Core2Duo and another with 2 x Xeon Core2Duo 5130), the Xeon system is significantly faster, like 2x, which is what I'd actually expect.
Dragos
The first Core2Duo (total of 2 processors):
/proc/cpuinfo
…
vendor_id : GenuineIntel
cpu family : 6
model : 15
model name : Intel(R) Core(TM)2 CPU 6600 @ 2.40GHz
…
The Xeons
/proc/cpuinfo
cpu family : 15
model : 4
model name : Intel(R) Xeon(TM) CPU 3.00GHz
The second Core2Duo
/proc/cpuinfo
cpu family : 6
model : 15
model name : Intel(R) Core(TM)2 Quad CPU @ 2.66GHz
However, I believe those Xeons were hyperthreading virtual processors, since cpuinfo also reports 2 siblings but only 1 core. I don't actually know the physical configuration of the Xeon machine though… Just what cpuinfo tells me.
Hope that helps.
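For anyone wanting to run the same siblings-versus-cores check on their own Linux box, here is a minimal Python sketch. `parse_cpuinfo` and `looks_hyperthreaded` are hypothetical helpers, and the sample text is an illustrative fragment that only mirrors what is described above (siblings 2, cores 1), not the machine's actual output.

```python
def parse_cpuinfo(text):
    """Parse /proc/cpuinfo-style text into one dict per logical processor."""
    processors, current = [], {}
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends one processor block.
            if current:
                processors.append(current)
                current = {}
            continue
        key, _, value = line.partition(":")
        current[key.strip()] = value.strip()
    if current:
        processors.append(current)
    return processors

def looks_hyperthreaded(proc):
    """True when a package reports more logical siblings than physical cores."""
    return int(proc.get("siblings", 1)) > int(proc.get("cpu cores", 1))

# Illustrative fragment only; on Linux, read the text from /proc/cpuinfo instead.
sample = """processor : 0
model name : Intel(R) Xeon(TM) CPU 3.00GHz
siblings : 2
cpu cores : 1
"""

procs = parse_cpuinfo(sample)
print(procs[0]["model name"], looks_hyperthreaded(procs[0]))
```

On a real machine, something like `parse_cpuinfo(open("/proc/cpuinfo").read())` should give one entry per logical processor.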
- digitallysane
- Member
- 1192 posts
- Joined: July 2005
- Offline
I suppose those are older (Pentium 4-based) Xeons. The current generation of Xeons is based on the Core architecture. On my workstation (2x dual-core Xeon), CPU Info reports:
Processor: Intel Core 2 Duo ??? (the ??? is from CPU Info)
CPU Name: Intel(R) Xeon(R) CPU 5130@2GHz
CPU Family: 6
Model: F
So those benchmarks are not very relevant since you are comparing very different architectures, the Core architecture being much more advanced than Pentium 4's. I'm not very sure about bandwidth, but for pure processing speed the Core architecture is much, much faster so that could explain the results of Core 2 Duo vs Xeon in the benchmarks.
Dragos
- JasonSlab
- Member
- 1535 posts
- Joined: March 2020
- Offline
hey, sorry for the late reply!
Here are some updated tests. I'm still getting slowdowns with different bucket sizes. I've used a much heavier geometry scene for my latest tests, with some displacement and existing shadow maps.
Here are the results:
bucket 16 raytrace
cores x2 4:25
cores x8 4:40
bucket 64 raytrace
cores x2 4:27
cores x8 4:16
bucket 16 micropolygon
cores x2 1:18
cores x8 3:58
bucket 64 micropolygon
cores x2 1:16
cores x8 4:17
average Mantra memory use 1.9 GB, VM 1.89 GB
rendering on:
dual Xeon quad core x5365 cpu @ 3.0GHz
XP 64bit
houdini 9.0.762
jason