synchronization of sim/render processes



carstenk: Member; 84 posts; Joined: Jan 2013; Offline

April 21, 2015 18:28

hi,
I have a question regarding synchronization of processes running on different machines.

suppose we have:
- simA, 100 frames, 5 minutes a frame
- simB, 100 frames, 4 minutes a frame

simB is directly dependent on the output of simA.
Now I could run simA on one box until it finishes and then run simB: total wait time 900 minutes.
But I would like to run simA and simB in parallel, with simB staggered slightly behind (assuming simB only needs one additional frame from simA) so that it never overtakes simA: total wait time 504 minutes (a win of nearly 400 minutes, obviously at the potential expense of licenses and machine power).
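The two totals above can be checked with a small schedule calculation. This is only a sketch under the assumption stated in the post: simB frame f needs simA frame f plus simB's own previous frame. The function names are made up for illustration.

```python
def sequential_minutes(frames, a_per_frame, b_per_frame):
    """Wall-clock time when simB starts only after simA has fully finished."""
    return frames * a_per_frame + frames * b_per_frame


def staggered_minutes(frames, a_per_frame, b_per_frame):
    """Wall-clock time when simB frame f starts as soon as simA frame f is
    written and simB frame f-1 is done (assumed dependency model)."""
    b_done = 0
    for f in range(1, frames + 1):
        a_done = f * a_per_frame        # simA runs uninterrupted from t=0
        b_start = max(a_done, b_done)   # simB waits for both of its inputs
        b_done = b_start + b_per_frame
    return b_done


print(sequential_minutes(100, 5, 4))  # 900
print(staggered_minutes(100, 5, 4))   # 504
```

Because simB is faster per frame than simA here, simB spends most of its time waiting, which is where the license cost mentioned above comes from.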

The same pattern extends to lots of other cases (for example, rendering the result of a sim while the simulation is still running).

a slightly more complicated example would be simA clustered into 10 different subSims, and me wanting to advance a frame on simB once all those 10 sub-chunks for the frame have completed.
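One way to express the "all 10 sub-chunks done" condition is a check over per-chunk marker files. This is a minimal sketch that assumes each sub-sim writes an empty marker file when it finishes a frame; the `FRAME_<f>_CHUNK_<c>_DONE` naming and the `marker_dir` location are hypothetical, not something Houdini produces on its own.

```python
import os


def chunk_marker(marker_dir, frame, chunk):
    # Hypothetical naming scheme; adapt to whatever the sub-sims actually write.
    return os.path.join(marker_dir, "FRAME_%d_CHUNK_%d_DONE" % (frame, chunk))


def missing_chunks(marker_dir, frame, num_chunks=10):
    """Return the sub-chunk indices whose marker file does not exist yet."""
    return [c for c in range(num_chunks)
            if not os.path.exists(chunk_marker(marker_dir, frame, c))]


def frame_complete(marker_dir, frame, num_chunks=10):
    """True once every sub-chunk of simA has reported this frame as done."""
    return not missing_chunks(marker_dir, frame, num_chunks)
```

simB would advance to a frame only once `frame_complete` returns True for it; `missing_chunks` is mainly useful for logging which sub-sim is holding things up.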

Is there an example of how I can set something like this up?

cheers!
carsten

carsten kolve - ds @ image engine


jlait: Staff; 6239 posts; Joined: Jul 2005; Offline

April 21, 2015 19:33

The Net Barrier ROP can do this.

It requires you to set up a tracker and point sim A and sim B to it.

Sim A will post its completed frame as its wait value.
Sim B will use that as its wait value.

A gotcha is that if you have background writes on, Sim B may not see the file yet.

Similarly, filesystems like NFS can sometimes take a few seconds for files to show up, which throws off attempts at precise synchronization.

Another approach may be to use the post-write script to touch a FRAME_$F_DONE file. Then sim B can have a pre-render script that is a Python loop waiting for that file to show up, with a sleep(1) between checks.
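The marker-file approach could look roughly like the following. It is a sketch in plain Python, not a tested Houdini setup: `marker_dir`, the helper names, and the optional timeout are assumptions added for illustration.

```python
import os
import time


def marker_path(marker_dir, frame):
    # Mirrors the FRAME_$F_DONE naming above; marker_dir is a hypothetical location.
    return os.path.join(marker_dir, "FRAME_%d_DONE" % frame)


def touch_marker(marker_dir, frame):
    """What sim A's post-write script would do once the frame is written."""
    path = marker_path(marker_dir, frame)
    with open(path, "a"):
        pass  # creating the empty file is the whole signal
    return path


def wait_for_marker(marker_dir, frame, poll_seconds=1.0, timeout_seconds=None):
    """Poll until sim A's marker for `frame` exists.

    Returns True when the file shows up, False if the optional timeout expires.
    """
    path = marker_path(marker_dir, frame)
    start = time.monotonic()
    while not os.path.exists(path):
        if timeout_seconds is not None and time.monotonic() - start >= timeout_seconds:
            return False
        time.sleep(poll_seconds)
    return True
```

Inside Houdini the frame number would presumably come from the session (for example `hou.intFrame()`) rather than being passed in by hand. Touching the marker only after the frame file is completely written matters because of the background-write and NFS delays mentioned above.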


carstenk: Member; 84 posts; Joined: Jan 2013; Offline

April 23, 2015 17:23

Thanks, that is very helpful. A problem we are facing is that our farm submission/dependency software (not hqueue) does not support the Net Barrier ROP. Your idea of a pre-frame script sounds cool for specific sim setups, but I am wondering about a more generic solution that allows one to keep track of the processes, restart, re-queue etc., all in a situation where you are basically limited to sequential and independent parallel processes.

I am wondering about chunking the execution of simA into, say, 20 (or 10, 5, 1) frames at a time, with each process taking a .sim file from the previous one as its initial state. The execution of each simB chunk could then be triggered by the corresponding simA chunk finishing:

simA_1.5
   |      -> simB_1.5
simA_2.5        |
   |      -> simB_2.5
simA_3.5        |
   |      -> simB_3.5
simA_4.5        |
   |      -> simB_4.5
simA_5.5        |
   |      -> simB_5.5

You'd pay extra for run-up costs (starting Houdini etc. until it can continue the sim). One could adjust the chunk size based on the relative speed of simA and simB, which could result in better license utilization (especially if simB is fast compared to simA).
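The chunked dependency chain above can be written down as a plain job list that a farm's dependency system could consume. This is a sketch only: the job names, the dictionary layout, and the `depends_on` key are illustrative and not tied to any particular farm software.

```python
def chunk_ranges(first, last, chunk_size):
    """Split an inclusive frame range into consecutive (start, end) chunks."""
    ranges = []
    start = first
    while start <= last:
        end = min(start + chunk_size - 1, last)
        ranges.append((start, end))
        start = end + 1
    return ranges


def build_jobs(first, last, chunk_size):
    """Describe simA/simB chunk jobs and what each one must wait for."""
    jobs = []
    for i, (start, end) in enumerate(chunk_ranges(first, last, chunk_size), 1):
        a_name = "simA_%d" % i
        b_name = "simB_%d" % i
        # Each simA chunk resumes from the .sim state of the previous simA chunk.
        a_deps = ["simA_%d" % (i - 1)] if i > 1 else []
        # Each simB chunk needs its simA chunk and its own previous chunk.
        b_deps = [a_name] + (["simB_%d" % (i - 1)] if i > 1 else [])
        jobs.append({"name": a_name, "frames": (start, end), "depends_on": a_deps})
        jobs.append({"name": b_name, "frames": (start, end), "depends_on": b_deps})
    return jobs


for job in build_jobs(1, 100, 20):
    print(job["name"], job["frames"], job["depends_on"])
```

Smaller chunks let simB start sooner but add more Houdini start-up overhead per chunk, which is the trade-off described above.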

wondering if anyone has experimented with setups like this?

carsten kolve - ds @ image engine


jlait: Staff; 6239 posts; Joined: Jul 2005; Offline

April 23, 2015 17:52

If you have got your farm software to work with the other tracker-based distribution, it is mostly the same setup for the Net Barrier.

Start the tracker as a 0-cpu job, record its location.
Launch simA and simB as a paired job (neither starts until both start), give them the tracker.
Kill the tracker when A & B are done.
