Say I have 8 work_items coming into a python server.
The python server is set to static and Session Count from Upstream Items.
Now say work_item 5 fails; my python server stops working while it waits for 5 to finish, which it never will. The other work_items were cooked properly, so I'm having to jump through some crazy hoops to finish the processing on the good items.
So is there a good way to handle failed jobs? I do some baking on very irregular meshes and sometimes something just doesn't compute. I want the rest of the network to finish properly so I only have to fix the one issue, instead of everything after work_item 5.
1. Any smart tips on handling failed work_items would be great!
2. Can a server work serially without waiting for the proper index of work_item to hit it?
Thanks!
-Johan
Error handling while using a python server_begin/end
- Johan Boekhoven
- chrisgreb
2. Can a server work serially without waiting for the proper index of work_item to hit it?

No, this is pretty fundamental to how dependencies work.

So is there a good way to handle failed jobs? I do some baking on very irregular meshes and sometimes something just doesn't compute. I want the rest of the network to finish properly

If the item is failing due to an exception being raised in the server and you want that not to be fatal, you could wrap your code in a try/except and just issue a warning instead (using `work_item.cookWarning()`).
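A minimal, self-contained sketch of that try/except pattern (plain Python, runnable outside Houdini): `StubWorkItem` and `bake` are stand-ins invented here for illustration; inside the real Python Server block, `work_item` is provided by PDG and `cookWarning()` is the call mentioned above.

```python
class StubWorkItem:
    """Stand-in for a PDG work item so the control flow is visible."""
    def __init__(self, index):
        self.index = index
        self.warnings = []

    def cookWarning(self, message):
        # Stand-in for the real PDG call: record instead of warning in the UI
        self.warnings.append(message)

def bake(item):
    # Placeholder for the real per-item processing; item 5 fails here,
    # mimicking a mesh that just doesn't compute
    if item.index == 5:
        raise RuntimeError("mesh did not compute")

def process(items):
    done = []
    for item in items:
        try:
            bake(item)
            done.append(item.index)
        except Exception as exc:
            # Demote the failure to a warning so later items still cook
            item.cookWarning("bake failed on item %d: %s" % (item.index, exc))
    return done

items = [StubWorkItem(i) for i in range(8)]
print(process(items))  # every index except 5 finishes
```

The point is only the shape of the handler: the exception is caught per item, logged as a warning, and the loop moves on instead of blocking everything downstream of item 5.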
Edited by chrisgreb - June 25, 2020 15:34:27
- Johan Boekhoven
Thanks Chris,
The issue arises from an HDA not being able to process the geo, so I don't think try/except would work, or do I misunderstand? Could I make a python node following the HDA Processor that handles the error?
But then I won't have the data from the HDA processing down the line for a work_item to process.
I'd like to simply disable, bypass, or delete the work_item altogether. But then I can't be static in that part of the flow. Would such a thing be possible?
- chrisgreb
You can install the Local Scheduler Job parm “On Task Failure” on the HDA Processor node, and set that to “Ignore”. That will make the item green even if the job fails.
Downstream of that you will want to use a filterbyexpression node to cull the items that have failed (by checking for output files, for example).
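One way to express that "check for output files" test is a small helper like the following (plain Python; the name `has_valid_output` is hypothetical, and how you obtain each work item's expected output path depends on your setup). The same existence/size check is what you would evaluate per item in the filterbyexpression node:

```python
import os

def has_valid_output(path):
    # Keep a work item only when its expected output file exists and is
    # non-empty; a bake that failed (but was ignored via "On Task
    # Failure" = Ignore) usually leaves no usable output file behind.
    return os.path.isfile(path) and os.path.getsize(path) > 0
```

Culling on the output file rather than on the item's cook state is what makes this work with the Ignore setting, since ignored failures still turn green.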
- Johan Boekhoven
- chrisgreb
Yes, that's one way to do it. However, it's not necessary to create a second scheduler node; instead you can install the scheduler job parms onto hdaprocessor1 via the Type Properties window:
https://www.sidefx.com/docs/houdini/tops/schedulers.html#jobparms