Tractor farm error with 18.5

Member
27 posts
Joined: March 2020
We have some code that automates selecting and triggering the submission of scheduler nodes to our Tractor farm. It has worked with 17.5 and 18.0, but with 18.5 the farm tasks spawned by the main PDG Cook task fail when Tractor tries to launch them, with hython complaining it can't find "<pdg_workingdir>/67890/pdgtemp/scripts/pdgjobcmd.py". "<pdg_workingdir>" is the path we supply to the scheduler.

With 17.5 and 18.0, Houdini put a copy of the .hip file and a "pdgtemp" dir into "<pdg_workingdir>", and inside "pdgtemp" were two directories with arbitrary 5-digit numbers as names, say "12345" and "67890". Both of these had a "scripts" subdirectory with "pdgjobcmd.py".

With 18.5, the two numbered subdirs are now at the top level of "<pdg_workingdir>". Both have a "pdgtemp" subdirectory, but only one ("12345") has the "scripts" subdirectory. So either your farm task code is writing the wrong numbered subdir into the hython command, or the scheduler setup isn't copying the "scripts" subdirectory into that numbered subdir as it does for the other one.

I'm going to have to dig back into that stuff again, but as far as I remember, all of that is controlled by Houdini code given the "<pdg_workingdir>" we supply, so nothing we do in configuring the scheduler node should be able to cause this sort of breakage. We're using 18.5.408 (submitting from Windows, Python 2, farm is CentOS), and I also tried the current production build 18.5.462 on the submit machine.
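For anyone wanting to confirm the same layout on their own working directory, here is a minimal sketch that reports which numbered subdirs actually contain the job script. It assumes the 18.5 layout described above ("<pdg_workingdir>/<digits>/pdgtemp/scripts/pdgjobcmd.py"); the function name `find_job_scripts` is made up for illustration and is not part of any Houdini or Tractor API.

```python
import os


def find_job_scripts(workingdir, script_name="pdgjobcmd.py"):
    """Return {numbered_subdir: True/False} for each all-digit subdir.

    Assumes the 18.5 layout described in this post:
    <workingdir>/<digits>/pdgtemp/scripts/<script_name>
    """
    result = {}
    for name in sorted(os.listdir(workingdir)):
        subdir = os.path.join(workingdir, name)
        # Only look at directories whose names are all digits, e.g. "12345".
        if not (name.isdigit() and os.path.isdir(subdir)):
            continue
        script = os.path.join(subdir, "pdgtemp", "scripts", script_name)
        result[name] = os.path.isfile(script)
    return result
```

Calling `find_job_scripts` on the path supplied to the scheduler and printing the result should show a `False` entry for whichever numbered subdir is missing its "scripts" directory.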
Member
603 posts
Joined: Sept. 2016
18.5 uses $HHP to find the scripts it needs to copy, so you should verify that $HHP is set in your Houdini environment. It's added by the houdini_setup scripts.
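A quick way to check this from the Python session that does the submission is sketched below. It only tests that the variable is set and points at an existing directory; the helper name `check_hhp` is hypothetical, and the check says nothing about whether the expected scripts are actually in that directory.

```python
import os


def check_hhp(environ=None):
    """Return a short status string describing the HHP variable.

    `environ` defaults to os.environ; pass a dict to check another environment.
    """
    environ = os.environ if environ is None else environ
    hhp = environ.get("HHP")
    if not hhp:
        return "HHP is not set"
    if not os.path.isdir(hhp):
        return "HHP is set but is not a directory: " + hhp
    return "HHP looks usable: " + hhp


if __name__ == "__main__":
    print(check_hhp())
```

Running this inside the same environment that launches the submit code (rather than a separate shell) matters, since the variable may be set in one and not the other.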