Hi all,
I have a table with several measurement fields (columns) for each entity (row). Each of the measurements has a lower bound and an upper bound that I have to check the values in the main table against, and I'm not sure of the best way to do this.
These boundary values are stored in a separate table.
I've come up with one solution but I'm not particularly happy with it. It seems to be inefficient. I'm fairly new to Alteryx so I'm sure there's a better way -- there's a lot I don't know!
In my "solution" I took the boundary values table and applied two Cross Tabs to separate the upper and lower boundary values, flattening each into a 1 x 3 table. I Appended these together, then Appended the resulting 1 x 6 table onto every record in my original data table, so that I could then use a Formula tool, apply Filters, etc.
But it bothers me that I'm duplicating the set of boundary values for each row in the main table. In my real-world file I have 12,000+ rows with 83 measures to validate. That's a lot of redundancy!
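In rough pandas terms (just a sketch to make the shape of the problem concrete -- the table and column names are made up, not my actual fields), what I'm doing amounts to this:

import pandas as pd

# Made-up stand-ins for the two inputs (3 measures shown; my real file has 83).
data = pd.DataFrame({
    "Entity": ["A", "B"],
    "Measure 1": [4.5, 3.8],
    "Measure 2": [4.3, 2.9],
    "Measure 3": [94, 98],
})
bounds = pd.DataFrame({
    "Measure": ["Measure 1", "Measure 2", "Measure 3"],
    "LowerBound": [4, 3, 93],
    "UpperBound": [5, 5, 97],
})

# The two Cross Tabs + Append: flatten the bounds into a single 1 x 6 row...
flat = pd.concat([
    bounds.set_index("Measure")["LowerBound"].add_prefix("Lower "),
    bounds.set_index("Measure")["UpperBound"].add_prefix("Upper "),
]).to_frame().T

# ...then stick that row onto every record, duplicating all six bound values per row.
wide = data.merge(flat, how="cross")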
I'd be delighted to hear of a better way to accomplish this task.
My workflow is attached.
Cheers,
Jennifer
Maybe try the Tile tool and manually set the boundaries to get the records you want into a single tile that you can process further as needed?
A simple solution for this is to Transpose your entity table, grouping by Entity. That gives you a generic Name field containing Measure 1, Measure 2, ..., Measure n, along with a single Value field. Then Join to the boundaries table on Name (main) = Measure (boundaries). From there, every row picks up the lower and upper bound for its measure, but as only two columns instead of hundreds.
Your final structure will look like this:
Entity   Name        Value   LowerBound   UpperBound
A        Measure 1     4.5            4            5
A        Measure 2     4.3            3            5
A        Measure 3      94           93           97
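If it helps to see the logic spelled out, here is a rough pandas equivalent of the Transpose + Join + check (the values echo the sample above, plus a made-up Entity B; none of this is your actual schema):

import pandas as pd

# Wide entity table: one row per entity, one column per measure (sample values only).
entities = pd.DataFrame({
    "Entity": ["A", "B"],
    "Measure 1": [4.5, 3.8],
    "Measure 2": [4.3, 2.9],
    "Measure 3": [94, 98],
})

# Boundaries table: one row per measure, stored once.
bounds = pd.DataFrame({
    "Measure": ["Measure 1", "Measure 2", "Measure 3"],
    "LowerBound": [4, 3, 93],
    "UpperBound": [5, 5, 97],
})

# Transpose (grouped by Entity): measure columns become Name/Value pairs.
long = entities.melt(id_vars="Entity", var_name="Name", value_name="Value")

# Join on Name (main) = Measure (boundaries): each row picks up its two bounds.
checked = long.merge(bounds, left_on="Name", right_on="Measure", how="left")

# Formula/Filter step: flag anything outside its bounds.
checked["InRange"] = checked["Value"].between(checked["LowerBound"], checked["UpperBound"])
print(checked[["Entity", "Name", "Value", "LowerBound", "UpperBound", "InRange"]])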
@ryan_lambert and @Joe_Mako: Your solution definitely makes the setup easier. Thank you.
I notice that this approach still stores a copy of the boundary values for each entity. In the example there are 4 entities, so there are 4 copies of each lower bound and 4 copies of each upper bound.
I'd like to work around this if I can, since my real-life data has ~12,000 entities:
12,000 entities * 83 measures * 2 bounds/measure ≈ 2 million data points that I would be storing, when I really only need to store 166 (83 * 2).
Thanks again,
Jennifer
@gc, thanks for the suggestion. But every single record has to be validated against every single measure, so I don't see how the Tile tool would help here. Am I missing something?
What would you like as an end result from your sample data?
@Joe_Mako: LOL -- the really tough question! :).
I'm still working out what I want to do with the non-conforming data. Probably segregate it for review. So I think the data + the appended "truth table" is what I want for output.
I'm proceeding with the current suggestion and will review the performance. If I ditch the superfluous boundary value columns as soon as I'm done with them, hopefully things won't be too bad.
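In other words (sticking with the pandas shorthand from earlier in the thread, with purely illustrative names and values), the plan is roughly:

import pandas as pd

# Stand-in for the joined table after the Transpose + Join step (sample values only).
checked = pd.DataFrame({
    "Entity": ["A", "A", "A"],
    "Name": ["Measure 1", "Measure 2", "Measure 3"],
    "Value": [4.5, 4.3, 94],
    "LowerBound": [4, 3, 93],
    "UpperBound": [5, 5, 97],
})

in_range = checked["Value"].between(checked["LowerBound"], checked["UpperBound"])

# Segregate the non-conforming rows for review, then drop the bound columns
# as soon as the check is done so they aren't carried through to the output.
review = checked[~in_range].drop(columns=["LowerBound", "UpperBound"])
clean = checked[in_range].drop(columns=["LowerBound", "UpperBound"])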
Cheers,
Jennifer