Troubleshooting PDG on the farm (HQueue, Deadline, Tractor)

seelan
Member
571 posts
Joined: May 2017
Offline
Check this document, which covers general issues:
https://www.sidefx.com/docs/houdini/tops/farm_troubleshooting.html
Member
37 posts
Joined: Nov. 2018
Offline
Hi, we are using HQueue on the farm, and everything works fine with HQrender and HQsim. But when we use hqueuescheduler, something goes wrong, and the client nodes report this error: '"C:\Program' is not recognized as an internal or external command, operable program or batch file.
I think it is related to spaces in paths, so we tried adding quotes to the "Universal HFS" parameter, but no luck.

Could someone help us to tackle this?
Member
603 posts
Joined: Sept. 2016
Offline
You could try changing the Python parm on HQueue Scheduler node. Otherwise attach your job diagnostics file so we can take a look.
Member
5 posts
Joined: March 2020
Offline
Greetings SideFX crew!

This issue is unrelated to the prior post but still in the same vein as the thread heading, “Debugging PDG on the Farm”.

I'm trying to get HQueue working with PDG and I've hit a wall.
By playing with the hqueuescheduler node, I've been able to get the ropgeometry node to generate tasks but that is as far as it gets.

The job never gets to the server as all tasks immediately fail. Each task info simply reports, “State Failed”.
State Failed
Index 4
Frame 1.0
Priority 0
Command "\FIEA-CAPSTONES\HQS\houdini_distros\hfs.windows-x86_64\bin\hython.exe" "//FIEA-CAPSTONES/HQS/projects/PDGQuickTest/pdgtemp/10332/scripts/rop.py" -p "//FIEA-CAPSTONES/HQS/projects/PDGQuickTest/PDGQuickTest.hipnc" -n "/obj/topnet1/ropgeometry1/ropnet1/geometry1" -to "/obj/topnet1/ropgeometry1" -i "ropgeometry1_ropfetch1_1_179" -fs 1 -fe 1 -fi 1
Expected Output __PDG_DIR__/geo/PDGQuickTest.ropgeometry1.4.4.bgeo.sc file/geo
hip $HIP/PDGQuickTest.hipnc file/hip
outputparm sopoutput
rop /obj/topnet1/ropgeometry1/ropnet1/geometry1
top /obj/topnet1/ropgeometry1
wedgeattribs [2] curlblend, maxcurve
wedgeindex 4 [ Exported ]
wedgenum 4 [ Exported ]
curlblend 0.0 [ Exported ]
frame 1.0
maxcurve 10.0 [ Exported ]
range [3] 1.0, 1.0, 1.0

I get no response when trying to access “Showlog” in each task info.

The hqueuescheduler node reports this error:
Error: Failed to start the Message Queue Job http://fiea-capstones:5000/jobs/view/313
CookError: HQueue reported failure
Error: Root job is invalid
I'm not sure what “Root job” in the error refers to. I think this may be the culprit, but I don't understand how to change it.

Thanks in advance for any help!
Member
7 posts
Joined: Aug. 2016
Offline
I had the same situation last week: HQueue worked in ROPs but not in TOPs.

The HQueue Scheduler has a button that says Load from HQueue. Does this work as expected?

If it doesn't work, open the HQueue console in your browser and go to the Network Folder page.

Then, set the correct path and leave Project Variable as HQROOT.

This is how I got my environment working.

I hope it helps.
Member
37 posts
Joined: Nov. 2018
Offline
chrisgreb
You could try changing the Python parm on HQueue Scheduler node. Otherwise attach your job diagnostics file so we can take a look.

After a few messages, @chrisgreb found the error. Thank you for all the support!

chrisgreb
This is because the Windows command shell doesn't handle spaces in the path. You can use $HFS, or you can use the Windows short name. To find it, run something like this in a CMD shell:

cmd /c for %A in ("C:\Program Files\Side Effects Software\Houdini 18.0.388") do @echo %~sA

C:\PROGRA~1\SIDEEF~1\HOUDIN~1.388

Now it's working.
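
For anyone who would rather look up the short name from a script than from a CMD shell, here is a minimal Python sketch. It assumes Python is available on the Windows machine and calls the Win32 function GetShortPathNameW through ctypes. The helper name short_path is made up for illustration and is not part of Houdini or HQueue. On non-Windows platforms it simply returns the path unchanged, and on volumes where 8.3 names are disabled Windows may return the long path as-is.

```python
import ctypes
import sys


def short_path(long_path: str) -> str:
    """Return the Windows 8.3 short form of long_path.

    Hypothetical helper: on platforms other than Windows the path is
    returned unchanged, because short names are a Windows concept.
    """
    if sys.platform != "win32":
        return long_path

    get_short = ctypes.windll.kernel32.GetShortPathNameW
    get_short.argtypes = [ctypes.c_wchar_p, ctypes.c_wchar_p, ctypes.c_uint]
    get_short.restype = ctypes.c_uint

    # Calling with an empty buffer asks Windows how many characters are needed.
    needed = get_short(long_path, None, 0)
    if needed == 0:
        # A return value of 0 signals failure, e.g. the path does not exist.
        raise OSError(f"could not resolve a short path for {long_path!r}")

    buffer = ctypes.create_unicode_buffer(needed)
    if get_short(long_path, buffer, needed) == 0:
        raise OSError(f"could not resolve a short path for {long_path!r}")
    return buffer.value
```

Calling short_path(r"C:\Program Files\Side Effects Software\Houdini 18.0.388") on the farm machine should give the same kind of result as the CMD one-liner above, provided that folder exists there.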
Member
4 posts
Joined: Aug. 2018
Offline
Does this function work properly in Houdini 18 (Linux)?

Edited by cream121314 - April 20, 2020 14:45:16

Attachments:
2020-04-20-16:34:13.png (31.1 KB)

Member
603 posts
Joined: Sept. 2016
Offline
Oh sorry, that's been changed in 18 because the automatic behavior was causing confusion. Instead you should use the “Tag” Job-Parm for HQueue. We'll fix the docs.
Member
4 posts
Joined: Aug. 2018
Offline
Thank you!
Member
5 posts
Joined: March 2020
Offline
Thanks everyone!
You all rock!

I put on my snorkel and fins and did a deep dive. I found what I had to do in order to get the network working on the queue.
Success! … sort of

The network executed successfully until I hit my ‘ROP Geometry Output’.
The first node in the ‘ROP Geometry Output’ is the ‘File’ node.
The first parameter of the ‘File’ node is @pdg_input (I also discovered that pdginput(0, 'file/geo', 0) produces the same result).

This node fails, unable to open the input path.
The path is actually missing a ‘\’ in the front of the string.
While I eventually worked around this by hardcoding the path into the parameter, I would still like to know what I am doing wrong and how I can get around it.
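
To make the symptom concrete, here is a minimal Python sketch of a check that restores the missing leading separator. It assumes the failing value is a Windows network (UNC) path that arrived with only one leading slash or backslash. The function name ensure_unc_prefix is hypothetical, and the sketch does not explain why PDG produced the shortened path; it is a diagnostic aid rather than a fix for the farm configuration.

```python
def ensure_unc_prefix(path: str) -> str:
    """Return path with exactly two leading separators.

    Hypothetical helper. Assumption: any path that starts with a slash
    or backslash is meant to be a UNC network path (//server/share/...).
    Do not use it on Linux-style absolute paths such as /mnt/projects.
    """
    if not path or path[0] not in "\\/":
        # Drive-letter paths (C:/...) and relative paths are left alone.
        return path
    separator = path[0]
    # Strip however many leading separators there are, then add back two.
    return separator * 2 + path.lstrip("\\/")
```

For example, ensure_unc_prefix(r"\FIEA-CAPSTONES\HQS") gives \\FIEA-CAPSTONES\HQS, while a path that already starts with two separators comes back unchanged.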

Enclosed are the Output and the Diagnostic.
Thanks again for all your help. You guys are the best community in the world!
Edited by ProfessorChris - April 23, 2020 16:12:46

Attachments:
job_4794_diagnostic_information.txt (7.3 KB)
job_4794_output.txt (1.1 KB)

Member
603 posts
Joined: Sept. 2016
Offline
There's something strange with your HQROOT config. It seems to indicate ‘localhost’ as the host of the network folder, which isn't right. You should take a look at the “Network Folders” section of the HQ UI:

https://www.sidefx.com/docs/houdini/hqueue/networkfolders.html
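
One quick way to spot this kind of misconfiguration is sketched below in Python. It assumes the network folder is written as a UNC-style path such as //server/share and flags any path whose host part is a loopback name. The names unc_host, points_at_loopback and LOOPBACK_NAMES are hypothetical and not part of HQueue; the values to test would be copied by hand from the “Network Folders” page.

```python
# Host names that only make sense on the machine itself, never on a farm client.
LOOPBACK_NAMES = {"localhost", "127.0.0.1", "::1"}


def unc_host(path: str) -> str:
    """Return the host part of a UNC-style path, or '' if there is none."""
    normalized = path.replace("\\", "/")
    if not normalized.startswith("//"):
        return ""
    # Everything between the leading // and the next / is the host name.
    return normalized[2:].split("/", 1)[0]


def points_at_loopback(path: str) -> bool:
    """True if the path's host is a loopback name such as localhost."""
    return unc_host(path).lower() in LOOPBACK_NAMES
```

A folder entered as \\localhost\HQS would be flagged, whereas //FIEA-CAPSTONES/HQS would not, since every client can resolve that host name to the same machine.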