Hi, I'm working on a workflow and my solution works, but it takes too long. I was hoping somebody would suggest a better approach.
I have three lists of variable. The information in these lists could be duplicated. Each variable has a key assigned to it (also not unique). I also have a master dataset where I need to find all combinations of these variables only when their ID's match.
I'm using append to create all combinations, but for just a third of actual data it takes me 15 minutes to run.
For simplicity, in my example below I will only use two sets of variables
List one: ID Variable1
1 A
1 B
2 B
List two: ID Variable2
1 c
1 d
2 e
In my dataset, where I have variables 1 and 2 as fields, I need to find the following combinations (I'll write ID's in the parentheses):
Variable 1 Variable 2
(1) A (1) c
(1) A (1) d
(1) B (1) c
(1) B (1) d
(2) B (2) e
I don't need a combination of A/e because their ID's never match.
Like I said, I use append to get the combinations and then the filter tool to remove ones where ID's don't match, but got more than million combinations.
What other tool/logic could help me?
Solved! Go to Solution.
OF COURSE!
Thank you so much, Join by ID works just like Append did, but runs 5 times faster and no need to filter out unneeded combinations.