Work Item Indices



gyltefors: Member; 31 posts; Joined: Feb. 2017; Offline

April 15, 2019 3:50 a.m.

Please do not set all work item indices to one, and introduce redundant items like csv_rowindex. This just makes life harder. You can't use “sort by work item index” in Wait for All, because the index has been renamed. You have to add a separate sort node that does nothing but restore the work item index to what it was supposed to be. And the sort node can't do much else, like reversing the order, as the SOP sort can.

Remember, each dot on the node is supposed to represent one work item. The index of an item should of course be zero for the upper left item, one for the next, and so on. This is totally inconsistent now. Some nodes index their items this way, others start with one, and others set all items to one.

If you argue that a CSV file is just one item, then there should be only one dot on the node. If you argue that it is a partition, then make it a real partition, and display it as a rectangle. But whatever you do, please be consistent, and don't introduce strange exceptions where there are several items, but they all have index one.

Edited by gyltefors - April 15, 2019 05:12:01


chrisgreb: Member; 603 posts; Joined: Sept. 2016; Offline

April 15, 2019 12:58 p.m.

gyltefors
Please do not set all work item indices to one, and introduce redundant items like csv_rowindex.

If your upstream item has an index of 1, then each downstream item will also have index 1, which I think is what you are seeing. If not please let us know how to repro.

The reasoning for the csvinput index change is that it should match the convention of all the other processors, which by default derive work item indices from the index of their upstream item. However, it should not have changed the csvinput default behavior, so that will be fixed in the next build. (Index = csv row index by default)
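
To make the two conventions concrete, here is a minimal sketch in plain Python. It does not use the PDG API: `WorkItem` and `items_from_rows` are hypothetical stand-ins, assumed only for illustration, that model "inherit the upstream index" versus "index by CSV row".

```python
from dataclasses import dataclass, field


@dataclass
class WorkItem:
    """Toy stand-in for a work item; this is not the real PDG class."""
    index: int
    attribs: dict = field(default_factory=dict)


def items_from_rows(upstream, rows, inherit_index):
    """Create one downstream item per CSV row.

    inherit_index=True  -> every item copies the upstream item's index
    inherit_index=False -> every item is indexed by its row number
    """
    items = []
    for row_index, row in enumerate(rows):
        index = upstream.index if inherit_index else row_index
        items.append(WorkItem(index, {"csv_rowindex": row_index, "row": row}))
    return items


upstream = WorkItem(index=1)
rows = [["a", "1"], ["b", "2"], ["c", "3"]]

inherited = [w.index for w in items_from_rows(upstream, rows, inherit_index=True)]
by_row = [w.index for w in items_from_rows(upstream, rows, inherit_index=False)]
print(inherited)  # [1, 1, 1]
print(by_row)     # [0, 1, 2]
```

With a single upstream item of index 1, the inheriting policy gives every row the same index, which is the symptom described in the first post.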


gyltefors: Member; 31 posts; Joined: Feb. 2017; Offline

April 16, 2019 12:03 a.m.

I have used TOPs for a few different use cases, and I have to disagree with the reasoning behind trying to match the upstream work item index. When I need to do matching, I always create my own attribute for clarity. Let's say I have a point cloud in SOPs. I want to know the index of each point, which is $PT or @ptnum. If I need to match two different point clouds, I'll create explicit indices like @my_point_id = @ptnum+1. This is clear, and leaves no room for misunderstanding. Just imagine if the point indices were unpredictable, and sometimes @ptnum was 1 for each point in the cloud, just because that would somehow match the upstream node, for some particular use case, for some particular user. It would just be horrible. It is the same with work item indices. If a node has 10 work items, just index them 0-9, and if some user wants some particular matching, let them create an attribute @my_work_item_id = 1, or whatever. Clarity is always better in the end.
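
A minimal sketch of that idea in plain Python, with work items modelled as dictionaries rather than real PDG objects. The helpers `make_items` and `match_by` are hypothetical names, assumed only for illustration: indices always run 0..n-1, and matching uses an explicit attribute the user writes.

```python
def make_items(names):
    """Index items 0..n-1 in creation order, like @ptnum on points."""
    return [{"index": i, "name": n} for i, n in enumerate(names)]


def match_by(attr, left, right):
    """Pair items from two lists that share the same explicit attribute value."""
    lookup = {item[attr]: item for item in right}
    return [(l, lookup[l[attr]]) for l in left if l[attr] in lookup]


renders = make_items(["shot_a", "shot_b", "shot_c"])
reviews = make_items(["shot_c", "shot_a"])

# Matching is opt-in: the user writes the key they care about.
for item in renders + reviews:
    item["my_work_item_id"] = item["name"]

pairs = match_by("my_work_item_id", renders, reviews)
matched = [(l["index"], r["index"]) for l, r in pairs]
print(matched)  # [(0, 1), (2, 0)]
```

The indices stay predictable on both sides, and the pairing depends only on the attribute that was created for that purpose.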

Edited by gyltefors - April 16, 2019 00:04:41


gyltefors: Member; 31 posts; Joined: Feb. 2017; Offline

April 17, 2019 6:59 a.m.

Btw, talking about the csv input node, it does not support Unicode. That is a big issue if you are not living in the US. And csv output splits a string attribute containing commas into several columns, but only when specifying tab as the delimiter. (Please also check this thread: https://www.sidefx.com/forum/topic/56964/?page=1#post-277657)
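
For reference, this is the round-trip behavior one would expect, sketched with Python's standard `csv` module rather than the TOP nodes. The sample row is made up for illustration; the point is that non-ASCII text survives a UTF-8 round trip and that a comma inside a field stays in one column when the delimiter is a tab.

```python
import csv
import io

# A row with non-ASCII text and a field that itself contains commas.
rows = [["name", "address"], ["Åsa Öberg", "Storgatan 1, 111 22, Stockholm"]]

# Write tab-delimited; a comma inside a field is plain data here and
# must not be treated as a separator.
buffer = io.StringIO()
csv.writer(buffer, delimiter="\t", lineterminator="\n").writerows(rows)
text = buffer.getvalue()

# Round-trip through UTF-8 bytes, as a file on disk would.
data = text.encode("utf-8")
decoded = io.StringIO(data.decode("utf-8"), newline="")
parsed = list(csv.reader(decoded, delimiter="\t"))

print(parsed == rows)  # True
print(len(parsed[1]))  # 2
```

When working with files on disk, the equivalent is to open them with `open(path, newline="", encoding="utf-8")`.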

Edited by gyltefors - April 17, 2019 07:06:28


gyltefors: Member; 31 posts; Joined: Feb. 2017; Offline

April 20, 2019 1:01 a.m.

Again, another issue with CSV output. Setting “Handle multiple values” to “Add Columns” with a 2D position attribute gives:

position position1 position1 position1 and so on
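
For comparison, a minimal sketch of a naming scheme that yields one unique column per component. The `column_names` helper and the `_0`, `_1` suffix convention are assumptions for illustration, not what csvoutput does or is documented to do.

```python
def column_names(attrib, size):
    """Return one unique column name per component of a multi-valued attribute."""
    if size == 1:
        return [attrib]
    return [f"{attrib}_{i}" for i in range(size)]


print(column_names("position", 2))  # ['position_0', 'position_1']
print(column_names("frame", 1))     # ['frame']
```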

Edited by gyltefors - Oct. 17, 2023 01:00:51


kenxu: Member; 544 posts; Joined: Sept. 2012; Offline

April 22, 2019 2:50 p.m.

Hi gyltefors, firstly we acknowledge that some of these peripheral nodes have not received the level of production testing they need, as we were more focused on the core FX workflows that would impact many more people out of the gate. However, we are working to close the gap as quickly as possible. For the first production build, both the CSV nodes and JSON nodes have received a significant amount of attention. If you take a look at the change log for csv or json, you'll see that a significant number of issues have already been resolved:

https://www.sidefx.com/changelog/?journal=17.5&categories=54&body=&version=17.5&build_0=173&build_1=234&show_versions=on&show_compatibility=on&items_per_page=

While that is part of the problem, the flip side of the coin is that your specific use case is not at all a simple one. WRT the JSON node, we have taken a detailed look at your use case. Part of the issue there at least is that the json file in that case is a hierarchy that is being flattened. Entries are heterogeneous, with ids that point to each other to reconstruct the hierarchy. Even if one were to write code to solve the problem, it would not be trivial. That said, we are doing all we can to help. In addition to the hierarchical array retrieval we have already added, we are planning to add:

1) The resolve path. So if you make a query like “carparks/*/address”, we'll attach for each workitem a resolved path, so that it will read carparks/1/address and carparks/2/address etc. This will help you put things back together.

2) Supporting sub-trees for queries, where the result of the query is itself a json blob representing a sub-portion of the original json. This would allow you to hierarchically pick apart a json file.
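
As a rough illustration of what the two planned features would return, here is a sketch in plain Python. It is not the JSON node's implementation; the `resolve` function and the sample document are assumptions for illustration. A `*` in the query matches any key or list index, each match is returned with its resolved path, and a query that stops early returns the sub-tree at that point.

```python
def resolve(data, query):
    """Yield (resolved_path, value) for a '/'-separated query.

    A '*' segment matches every key of a dict or every index of a list.
    """
    def walk(node, parts, trail):
        if not parts:
            yield "/".join(trail), node
            return
        head, rest = parts[0], parts[1:]
        if isinstance(node, dict):
            keys = list(node) if head == "*" else ([head] if head in node else [])
            for key in keys:
                yield from walk(node[key], rest, trail + [key])
        elif isinstance(node, list):
            if head == "*":
                indices = range(len(node))
            elif head.isdigit() and int(head) < len(node):
                indices = [int(head)]
            else:
                indices = []
            for i in indices:
                yield from walk(node[i], rest, trail + [str(i)])

    yield from walk(data, query.split("/"), [])


doc = {
    "carparks": {
        "1": {"address": "1 Main St", "spaces": 40},
        "2": {"address": "9 Side Rd", "spaces": 12},
    }
}

# 1) Resolved paths for a wildcard query.
addresses = list(resolve(doc, "carparks/*/address"))
print(addresses)
# [('carparks/1/address', '1 Main St'), ('carparks/2/address', '9 Side Rd')]

# 2) Sub-tree results: stopping the query early returns a nested blob.
subtrees = dict(resolve(doc, "carparks/*"))
print(subtrees["carparks/2"])  # {'address': '9 Side Rd', 'spaces': 12}
```

The resolved path is what lets downstream work items be regrouped by the part of the hierarchy they came from.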

So these would fall more into the RFE bucket than the BUG bucket, and should help you get further. However, we also recommend that the form of the data be restructured on your end to make it a little easier to digest. Finally, WRT CSV, one of the issues that was blocking you, that the Table SOP was not supporting CSV files written by csvoutput, has been solved and made it into the production build.

If there are any further specific problems we can help you with, please let us know.

Edited by kenxu - April 22, 2019 17:23:32

- Ken Xu


jason_iversen: Member; 12427 posts; Joined: July 2005; Online

April 22, 2019 3:25 p.m.

In my opinion, for complex cases it's (arguably) simpler to write the loader yourself. There is only so much you can expect from a UI for a format that can return such hugely varying data topology.
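
A minimal sketch of what such a hand-written loader could look like for flattened, heterogeneous entries whose ids point to each other. The actual file is not shown in this thread, so the field names `id`, `parent`, and `type`, the sample records, and the `build_tree` helper are all assumptions for illustration.

```python
import json


def build_tree(entries, id_key="id", parent_key="parent"):
    """Nest flat records under their parents.

    Records whose parent is missing or unknown become roots.
    """
    nodes = {e[id_key]: dict(e, children=[]) for e in entries}
    roots = []
    for node in nodes.values():
        parent = nodes.get(node.get(parent_key))
        (parent["children"] if parent else roots).append(node)
    return roots


flat = json.loads("""
[
  {"id": "c1", "type": "carpark", "address": "1 Main St"},
  {"id": "l1", "type": "level", "parent": "c1", "floor": 0},
  {"id": "l2", "type": "level", "parent": "c1", "floor": 1},
  {"id": "s1", "type": "space", "parent": "l2", "number": 17}
]
""")

roots = build_tree(flat)
print(len(roots))                                        # 1
print([c["id"] for c in roots[0]["children"]])           # ['l1', 'l2']
print(roots[0]["children"][1]["children"][0]["number"])  # 17
```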

Jason Iversen, Technology Supervisor & FX Pipeline/R+D Lead @ Weta FX
also, http://www.odforce.net


gyltefors: Member; 31 posts; Joined: Feb. 2017; Offline

April 23, 2019 1:23 a.m.

jason_iversen
In my opinion, for complex cases it's (arguably) simpler to write the loader yourself. There is only so much you can expect from a UI for a format that can return such hugely varying data topology.

Learning Python is next on my list. I first learned HAPI, and wrote my pipeline using that. Hearing about TOPs, it seemed to be a better long term solution, so I am switching over to that. While I will eventually need to write some custom integration using Python, there are some general issues with the JSON/CSV nodes that kept popping up for different use cases, so having those ironed out while TOPs is still in early development would be very nice. Also, for standalone PDG, I expect users will start to pipe in a lot of different kinds of data, so having a basic set of flexible data retrieval nodes would likely become even more important.

Edited by gyltefors - April 23, 2019 01:23:28


jason_iversen: Member; 12427 posts; Joined: July 2005; Online

April 23, 2019 5:36 a.m.

Python should be a cakewalk compared to C++

Did you give 17.5.234+ a whirl? Did it help?



kenxu: Member; 544 posts; Joined: Sept. 2012; Offline

April 23, 2019 9:34 a.m.

there are some general issues with the JSON/CSV nodes that kept popping up for different use cases, so having those ironed out while TOPs is still in early development would be very nice. Also, for standalone PDG, I expect users will start to pipe in a lot of different kinds of data, so having a basic set of flexible data retrieval nodes would likely become even more important.

Could not agree more. It's a repeated exercise at this point of looking at use cases that we may still not be addressing well, ironing out the wrinkles there, rinse and repeat. If you're open to it, we'd be up for a periodic call to see where the remaining issues are for you and see what we could do about it. Please message me in case you're interested.

- Ken Xu


anon_user_40689665: Member; 648 posts; Joined: July 2005; Offline

April 23, 2019 4:45 p.m.

gyltefors
The JSON/CSV related nodes are, honestly speaking, totally utterly broken. They have been for weeks. And it has kept me frustrated with TOPs for weeks. And it has prevented any kind of progress in my project for… WEEKS

Try doing it with VEX in SOPs. I'm getting a 3-second cook time when loading, reformatting, combining and filtering four CSV files with 10000 lines each… You can also see what's happening via the spreadsheet.


gyltefors: Member; 31 posts; Joined: Feb. 2017; Offline

April 28, 2019 12:29 a.m.

kenxu
there are some general issues with the JSON/CSV nodes that kept popping up for different use cases, so having those ironed out while TOPs is still in early development would be very nice. Also, for standalone PDG, I expect users will start to pipe in a lot of different kinds of data, so having a basic set of flexible data retrieval nodes would likely become even more important.

Could not agree more. It's a repeated exercise at this point of looking at use cases that we may still not be addressing well, ironing out the wrinkles there, rinse and repeat. If you're open to it, we'd be up for a periodic call to see where the remaining issues are for you and see what we could do about it. Please message me in case you're interested.

I tried out the last stable release, and it broke my TOPs setup. I am going back to an old release for now, and will contact you directly regarding the various issues with these nodes.

Edited by gyltefors - Oct. 17, 2023 00:59:29


jason_iversen: Member; 12427 posts; Joined: July 2005; Online

April 30, 2019 2:07 p.m.

Was your setup relying on buggy behavior, perhaps? I've had that situation before, where a bug fix actually broke my setup. An ironic bug fix, really.

Edited by jason_iversen - May 1, 2019 16:28:52



kenxu: Member; 544 posts; Joined: Sept. 2012; Offline

April 30, 2019 3:34 p.m.

We did talk about this internally - we're not seeing anything obvious that would account for that - waiting for your files to see what's actually happening.

- Ken Xu
