Q: Best way to "instakill" TOP work items



pbowmar: Member; 7025 posts; Joined: July 2005; Offline

March 29, 2021 4:05 p.m.

Hi,

So I'm working with very high-res geometry and can easily overload a 128 GB machine's RAM by accident. Obviously, I am working to not do this.

However, in the meantime, what is the best way to "instakill" everything in TOPs? If I click Cancel Current Cook, the Hython processes still seem to want to finish their current work item, which often leads to swapping in my case.

I'm currently attempting to use the Task Manager in Windows to kill the processes, but that seems ugly, plus it's slow if I happen to have several hython.exe processes due to having several slots.

I've tried both with and without the Service, though I prefer the Service because work items get picked up faster, of course.

18.5.462 Win 10 Py 3

Cheers

Peter B


pbowmar: Member; 7025 posts; Joined: July 2005; Offline

March 29, 2021 4:13 p.m.

I just realised I'm basically looking for a render farm's "boot job" vs "drain", i.e. sometimes I want the processes to stop _now_ regardless of their progress, and other times I do want the work item to finish what it's doing but not pick up any more work items.

This is localscheduler btw. If I were using Deadline or something similar, I could go to that scheduler's UI and I'd have Boot vs Drain there.
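To make the boot vs drain distinction concrete, here is a minimal sketch using plain Python subprocesses. `TinyPool` and its methods are hypothetical names for illustration only; this is not the PDG or local scheduler API, just the two shutdown behaviours side by side.

```python
import subprocess
import sys


class TinyPool:
    """Hypothetical stand-in for a scheduler that runs jobs as subprocesses."""

    def __init__(self):
        self.running = []      # Popen handles for jobs that were started
        self.accepting = True  # False once drain() or boot() has been called

    def submit(self, argv):
        """Start a job, or return None if the pool no longer accepts work."""
        if not self.accepting:
            return None
        proc = subprocess.Popen(argv)
        self.running.append(proc)
        return proc

    def drain(self):
        """Stop picking up new work, but let running jobs finish."""
        self.accepting = False
        for proc in self.running:
            proc.wait()

    def boot(self):
        """Stop picking up new work and kill running jobs immediately."""
        self.accepting = False
        for proc in self.running:
            proc.kill()
            proc.wait()  # reap the process so it does not linger


if __name__ == "__main__":
    pool = TinyPool()
    # A long-running dummy job standing in for a heavy work item.
    pool.submit([sys.executable, "-c", "import time; time.sleep(60)"])
    pool.boot()  # returns right away instead of waiting for the sleep
```

With `drain()` the same call would block until the sleep finished, which is the "finish what you're doing but take nothing new" case.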

Cheers,

Peter B


tpetrick: Staff; 585 posts; Joined: May 2014; Offline

March 30, 2021 5:48 p.m.

Are you seeing that behavior when canceling all types of work items, or is it only specific TOP nodes that have that issue?

It's possible that the problem is being caused by some sort of platform-specific behavior. For me on Linux, canceling a cook with the local scheduler immediately kills any running processes. On Windows the local scheduler uses the taskkill system utility when ending a running process. It passes in the force (/F) flag which is supposed to immediately end the process as well.

Service jobs are currently not interruptible, so an actively cooking item will run until completion. The service worker processes themselves persist even after the cook is canceled so they can be reused, which means they aren't killed when a cook is canceled from TOPs. We can probably add a more aggressive cancel that also kills any service worker processes, or alternatively an option on the service definition that controls whether or not the workers are killed when canceling a cook.
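As a manual stopgap outside of TOPs, the force-kill described above can be scripted rather than done by hand in Task Manager. This is a rough sketch, not what the local scheduler itself runs: the function names are hypothetical, and the process name `hython` is an assumption (on some installs the actual binary may be named differently, so check your process list first). It uses `taskkill` with `/F` (force), `/T` (include child processes) and `/IM` (match by image name) on Windows, and `pkill -KILL` elsewhere.

```python
import subprocess
import sys


def build_kill_command(image_name="hython", platform=None):
    """Return the argv list that force-kills every process with this name."""
    platform = platform or sys.platform
    if platform.startswith("win"):
        # /F = force, /T = also end child processes, /IM = match by image name
        return ["taskkill", "/F", "/T", "/IM", image_name + ".exe"]
    # -KILL sends SIGKILL; -x requires the process name to match exactly
    return ["pkill", "-KILL", "-x", image_name]


def kill_all(image_name="hython"):
    """Run the kill command and return its exit code.

    A non-zero exit code usually just means no matching process was found.
    """
    return subprocess.run(build_kill_command(image_name), check=False).returncode


if __name__ == "__main__":
    # Kills ALL matching processes on the machine, including any
    # unrelated hython sessions, so use with care.
    print(kill_all())
```

Note that this kills every matching process, not only the ones started by the current TOP network.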


pbowmar: Member; 7025 posts; Joined: July 2005; Offline

March 31, 2021 10:48 a.m.

Hmm, good question. I will pay more attention to the node type when it happens again.

I suspect, though, that it's the Service jobs I notice most, since I do like the services.

The option to kill the service on canceling the cook might be a good one, defaulting to Off since what I was doing was a bit of an edge case in terms of memory consumption. Or yeah, an "Ultra-Kill" button (please call it that) that just goes nuts and kills off all the things.

In the meantime, I will experiment with having the potentially troublesome TOP nodes _not_ use the service, which may give me the best of both worlds.

Cheers,

Peter B
