Hi,
I have data about people, each of whom has performed some actions. If two people have performed the same or similar actions, I want to list those actions and group them into category 1. Then I check which actions are left, see whether two people have performed similar actions among those, and list and group them into category 2. The same step is repeated on the remaining actions to produce categories 3, 4, 5, ...
I have built a sample workflow (attached), but I have a few concerns:
1- The workflow is not dynamic. It only works when there are three groups or fewer, but the actual data might contain thousands of groups.
2- The workflow is not optimized. It joins the input with itself. That is fine for the sample, which runs quickly, but the actual data has 25M records, so performance will be very slow.
Thanks in advance,
Hi,
Thanks a lot. It is indeed dynamic, but when I tried it on the actual data, the Cartesian join ran for so long that I had to stop the workflow. I can't think of a way to do it without the Cartesian join, though.
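
One possible way around the Cartesian join, sketched in Python rather than in the workflow tool, is to group by action instead of comparing people pairwise. This is only a sketch under several assumptions: "same action" means an exact match on the action value ("similar" actions are not handled), the data can be read as (person, action) pairs, and a category is taken to be the set of actions shared by the same group of two or more people. The function name `categorize_shared_actions` and the sample records are hypothetical, and the category numbers are just a deterministic ordering, not necessarily the order the original workflow would produce.

```python
from collections import defaultdict


def categorize_shared_actions(records):
    """Group shared actions into numbered categories without a self-join.

    records: iterable of (person, action) pairs (assumed layout).
    Returns a dict mapping category number -> sorted list of actions.
    """
    # Pass 1: for each action, collect the set of people who performed it.
    # This is a plain group-by and touches every record once.
    people_by_action = defaultdict(set)
    for person, action in records:
        people_by_action[action].add(person)

    # Pass 2: actions performed by the same set of two or more people
    # are placed together. Actions done by a single person are skipped.
    actions_by_people = defaultdict(list)
    for action, people in people_by_action.items():
        if len(people) >= 2:
            actions_by_people[frozenset(people)].append(action)

    # Number the categories in a deterministic (alphabetical) order.
    ordered = sorted(sorted(actions) for actions in actions_by_people.values())
    return {number: actions for number, actions in enumerate(ordered, start=1)}


# Hypothetical sample data, for illustration only.
sample = [
    ("Ann", "login"), ("Bob", "login"),
    ("Ann", "upload"), ("Bob", "upload"),
    ("Bob", "delete"), ("Cat", "delete"),
    ("Cat", "share"),
]
categories = categorize_shared_actions(sample)
```

Because each record is visited once in the first pass and each distinct action once in the second, the work grows roughly linearly with the number of records instead of quadratically, and the number of categories does not need to be known in advance. The same two group-by steps could in principle be reproduced with aggregation steps in the workflow itself, though whether that matches the intended definition of a category would need to be checked against the real data.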