Error handling while using a Python Server begin/end

Johan (Member, 72 posts)
Say I have 8 work_items coming into a python server.
The Python Server is static, with Session Count set to From Upstream Items.

Now say work_item 5 fails: my Python Server stops working while it waits for item 5 to finish, which it never will. The other work_items were cooked properly, so I'm having to jump through some crazy hoops to finish processing the good items.

So is there a good way to handle failed jobs? I do some baking on very irregular meshes, and sometimes something just doesn't compute. I want the rest of the network to finish properly so I only have to fix the one issue, instead of everything after work_item 5.

1. Any smart tips on handling failed work_items would be great!
2. Can a server work serially without waiting for the proper index of work_item to reach it?

Thanks!
-Johan
chrisgreb (Member, 603 posts, joined Sept. 2016)
2. Can a server work serially without waiting for the proper index of work_item to reach it?
No, this is pretty fundamental to how dependencies work.

So is there a good way to handle failed jobs, I do some baking on very irregular meshes and sometimes something just doesn't compute. I want the rest of the network to finish properly

If the item is failing due to an exception being raised in the server and you want this to not be fatal, you could wrap your code in a try/except and just issue a warning instead (using `work_item.cookWarning()` ).
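To make that concrete: inside the server's cook code, the try/except would look roughly like this. This is a minimal sketch, not the exact pdg setup — `bake_fn` stands in for your own baking code, and `FakeWorkItem` is only a test stub that mimics `work_item.cookWarning()` so the logic can be exercised outside Houdini:

```python
# Sketch: downgrade a per-item failure to a warning instead of a fatal error.
# bake_fn is a hypothetical stand-in for your own processing code.

def safe_process(work_item, bake_fn):
    """Run bake_fn on a work item; turn any exception into a cook warning."""
    try:
        bake_fn(work_item)
        return True
    except Exception as exc:
        # One bad mesh no longer kills the whole server session.
        work_item.cookWarning("bake failed: %s" % exc)
        return False

class FakeWorkItem:
    """Minimal stand-in for pdg.WorkItem, for testing outside Houdini."""
    def __init__(self):
        self.warnings = []
    def cookWarning(self, msg):
        self.warnings.append(msg)

item = FakeWorkItem()
ok = safe_process(item, lambda wi: 1 / 0)  # a "bake" that always blows up
```

In Houdini you would call `safe_process` with the real `work_item` the server hands you, and the item carries a warning instead of blocking the session.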
Edited by chrisgreb - June 25, 2020 15:34:27
Johan (Member, 72 posts)
Thanks Chris,

The issue arises from an HDA not being able to process the geo, so I don't think a try/except would work, or do I misunderstand? Could I make a Python node following the HDA Processor that handles the error?
But then I won't have the data from the HDA processing down the line for a work_item to process.
I'd like to simply disable, bypass, or delete the work_item altogether. But then I can't be static in that part of the flow. Would such a thing be possible?
chrisgreb (Member, 603 posts, joined Sept. 2016)
You can install the Local Scheduler Job parm “On Task Failure” on the HDA Processor node, and set that to “Ignore”. That will make the item green even if the job fails.

Downstream of that you will want to use a filterbyexpression node to cull the items that have failed (by checking for output files, for example).
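The actual filter expression depends on your setup, but the check itself can be sketched as a plain Python predicate. `keep_item` and the example paths below are illustrative, not PDG API; the idea is simply "keep an item only if its expected outputs actually exist on disk and are non-empty":

```python
import os
import tempfile

def keep_item(output_paths):
    """Filter predicate (sketch): keep a work item only if every expected
    output file exists and is non-empty. A failed bake typically leaves
    nothing (or a zero-byte file) behind, so those items get culled."""
    return bool(output_paths) and all(
        os.path.isfile(p) and os.path.getsize(p) > 0 for p in output_paths
    )

# Quick check outside Houdini with a throwaway file:
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"geo")
    good = f.name

kept = keep_item([good])                      # real, non-empty output
culled = keep_item(["/no/such/file.bgeo"])    # missing output -> cull
```

Inside the Filter By Expression node you would run the same kind of check against each work item's expected output paths instead of these stand-in filenames.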
Johan (Member, 72 posts)
That makes sense, thanks for that!
And would that look something like this?

Or do I need to do something else?

Thanks!
-Johan
Edited by Johan Boekhoven - June 26, 2020 02:38:24

Attachments:
top_errorhandling.png (131.7 KB)

chrisgreb (Member, 603 posts, joined Sept. 2016)
Yes, that's one way to do it. However, it's not necessary to create a second scheduler node; instead, you can install the scheduler job parms onto hdaprocessor1 via the Type Properties window:

https://www.sidefx.com/docs/houdini/tops/schedulers.html#jobparms