PDG Services - Accessing Log from work_item

AdamChabane

May 27, 2024 14:46:49

Hey everyone, I'm trying to collect logs for work items cooked in Houdini Services. I know where the logs end up, but I want a Pythonic way of getting the log file location.

I saw this thread: https://www.sidefx.com/forum/topic/73446/

But it's a few years old and I doubt it's talking about Services.

I know I can get the scheduler (for local cooking) in Python, and therefore the local log for a work item, using pdg.Scheduler.getLogURI(work_item), but as far as I'm aware the scheduler isn't used by the Service at all.

Is there an equivalent for work_items in Services?

Thank you

AslakKS

May 27, 2024 15:30:51

Hi, having wanted this in the past, I took a look.
In a Python script this seems to work for getting the log paths (I used a Python Script TOP, running on generate).
You would have to make sure that the services are actually running at that stage; I usually set these kinds of scripts to wait for all upstream items to be cooked.

sm = pdg.ServiceManager.get()
service = sm.getService("blockTest")
clients = service.clients
logPaths = []
for clientName in clients:
    # Look up each client by its own name rather than always the first one
    client = service.client(clientName)
    logPaths.append(client.logPath)
print(logPaths)

AdamChabane

May 27, 2024 15:38:50

Thanks again AslakKS, is there a way to know which work_item was run on which client?

I see the clients are static and could be cooked multiple times - ideally I'd be able to grab the log before the client is cooked again.

AdamChabane

May 27, 2024 15:56:03

I guess I could write the work_item id to the log and then loop over all the logs for all clients and sort/split the data into a dictionary by work_item but this feels kinda disgusting...
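
For reference, a rough sketch of that marker idea in plain Python. It assumes each work item prints a line such as `WORK_ITEM_BEGIN <id>` to the client log before doing its work; that marker format and the function name are made up for illustration, not something PDG writes on its own.

```python
import re
from collections import defaultdict

# Hypothetical marker line that each work item would print when it starts.
MARKER = re.compile(r"^WORK_ITEM_BEGIN (\d+)\s*$")


def split_log_by_work_item(log_text):
    """Group log lines under the id from the most recent marker line."""
    logs = defaultdict(list)
    current_id = None
    for line in log_text.splitlines():
        match = MARKER.match(line)
        if match:
            # A new work item started; later lines belong to it.
            current_id = int(match.group(1))
            continue
        if current_id is not None:
            logs[current_id].append(line)
    return dict(logs)
```

Lines written before the first marker (client startup output, for example) are dropped, and a work item that runs on the same client more than once has its lines appended under the same id.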

AslakKS

May 27, 2024 18:32:27

I think you're correct: you would need to add marks to the log files, or do something like this.
As long as you're running in a "Houdini Service Block", you can store the current number of lines in the log at the start of the loop.
Then, at the end of the block, extract the lines from the stored line number to the end of the file (the client needs to be reset to make it write to the log).

Script at start of loop:

import os
import sys
import re

# The service client's command line contains the log file location
command_string = " ".join(sys.argv)
work_item.setStringAttrib("command_string", command_string)
# Parse command string to extract --logfile {path}
log_file_match = re.search(r"--logfile\s+(\S+)", command_string)
if log_file_match:
    log_file = log_file_match.group(1)
    work_item.setStringAttrib("log_file", log_file)
    if os.path.isfile(log_file):
        # Record how many lines the log already has before this item runs
        with open(log_file) as f:
            current_line = sum(1 for _ in f)
        work_item.setIntAttrib("log_start", current_line)
    else:
        # No log file yet, so there are no earlier lines to skip
        work_item.setIntAttrib("log_start", 0)

Script at end of loop (with the "Reset Service" parm set to "Reset Client" - "Before Cook"):

from itertools import islice

log_start = work_item.intAttribValue("log_start")
with open(work_item.stringAttribValue("log_file")) as f:
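    # Skip everything before the recorded start, plus the first two new lines
    lines = list(islice(f, log_start + 2, None))

# Each line already ends with its own newline, so join without a separator
work_item.setStringAttrib("log_contents", "".join(lines))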

Attaching my mock-up scene
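
The two scripts above boil down to "record the line count, then read everything after it". A standalone sketch of just that part, using an ordinary text file in place of the service client log; the helper names are hypothetical and nothing here touches PDG:

```python
from itertools import islice


def count_lines(path):
    """Return the number of lines currently in the file."""
    with open(path) as f:
        return sum(1 for _ in f)


def read_lines_from(path, start, skip=0):
    """Return the lines after the first `start + skip` lines, newlines stripped."""
    with open(path) as f:
        return [line.rstrip("\n") for line in islice(f, start + skip, None)]
```

Calling `count_lines` at the start of the block and `read_lines_from(path, start, skip=2)` at the end would mirror the `log_start + 2` slice used in the end-of-loop script.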

tpetrick

May 28, 2024 16:15:28

For tasks that run on services there isn't a separate log file per work item -- there's a single log for the whole service client. However, the individual log data for work items that ran on a service client is available through the pdg.WorkItem.logMessages property, which is basically just a string buffer containing the log data that was produced while that work item ran. That's also where log data is written for any in-process work items that call addMessage, addWarning or addError as part of their script code.

You can determine the name of the service client that ran a particular task using the pdg.WorkItemStats object returned from the pdg.WorkItem.stats() method. That object contains the various cook time durations for the work item, as well as the name of the service client that ran the work item, if applicable:

stats = work_item.stats()
print("client = {}".format(stats.serviceClient))
print("service = {}".format(work_item.node.serviceName))
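
Building on that, a small sketch that groups the per-item log buffers by the client that ran them. It only relies on `stats().serviceClient`, `logMessages` and the work item `id`; the function name is hypothetical, and treating an empty client name as "did not run on a service" is an assumption rather than documented behaviour.

```python
from collections import defaultdict


def logs_by_service_client(work_items):
    """Map service client name -> list of (work item id, log text)."""
    grouped = defaultdict(list)
    for item in work_items:
        client = item.stats().serviceClient
        if not client:
            # Assumption: an empty name means the item did not run on a service.
            continue
        grouped[client].append((item.id, item.logMessages))
    return dict(grouped)
```

Because the function only reads attributes, it can be tried outside Houdini with stand-in objects before pointing it at real work items.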

AslakKS

May 28, 2024 17:06:10

Ah! This is excellent - this makes it trivial to get the logs.
I tried it with a "ROP Geometry" TOP running with services plus a Python script that reads from the "parent_item", and it just works.

AdamChabane

June 1, 2024 18:12:41

Fantastic!

Apologies, I didn't realise I could reference an out-of-process work_item while still in a Service Block.

I am using

pdg.WorkItem.parent

to get the parent in-service work item and then

.logMessages

from an in-process script, which is allowing me to pass through the logs without worrying about on-disk log files.
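
In case it helps anyone else, roughly what that looks like as a helper. This is a sketch only: the function name and the "log_contents" attribute name are my own choices, and it assumes `parent` is None when the work item has no parent.

```python
def forward_parent_log(work_item, attrib_name="log_contents"):
    """Copy the parent work item's log buffer onto this work item."""
    parent = work_item.parent
    if parent is None:
        # Assumption: no parent means there is nothing to forward.
        return False
    work_item.setStringAttrib(attrib_name, parent.logMessages)
    return True
```

In an in-process Python Script TOP this would be called as `forward_parent_log(work_item)`.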

Thank you both!