Hello - I am new to creating iterative macros and am looking for some help. The workflow below is a kind of "custom clustering" workflow that groups similar part numbers (based on the similarity of their subcomponents). The input file has 3 columns. The first column is a Part Number, the second column is the Part Number it has been compared to (in previous workflows), and the third column is the "percent similarity" between the first two. The same part numbers appear in both the first and second columns, since "everything is being compared to everything".
The workflow I have created:
1. Filters for similarities greater than 90%
2. Counts, for each part, how many other parts are more than 90% similar to it
3. Sorts from greatest to smallest and takes the first value (let's call it Part A)
4. Then joins back to the original data set to see if there are any parts that are not more than 90% similar to Part A itself but are more than 90% similar to one of Part A's similar parts
5. Then I go through the cleansing steps to get unique part numbers into a grouping
6. I want to iteratively add these groupings with an added "grouping number" column that is Iteration Number + 1
7. Then I take these unique parts and filter them out of the source data set since they are "already accounted for" and loop back through
8. Stop when Count of Records of Iteration Number = Count of Records of Iteration Number-1 (basically no more parts are being put into groups)
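For reference, the steps above could be sketched outside Alteryx roughly as follows. This is only a minimal Python sketch of the logic, not the macro itself. The function name `greedy_groups`, the `(part, compared_part, similarity)` tuple layout, and a similarity stored as a fraction between 0 and 1 are all assumptions made for illustration. It also assumes that step 7 removes grouped parts from both columns, and it treats "the best remaining part has no similar parts left" as the stop condition from step 8.

```python
from collections import defaultdict


def greedy_groups(pairs, threshold=0.90):
    """Group parts following steps 1-8; returns {part: grouping number}.

    pairs is an iterable of (part, compared_part, similarity) tuples
    (hypothetical layout; similarity assumed to be a 0-1 fraction).
    """
    # Step 1: keep only pairs above the threshold, ignoring self-comparisons.
    neighbors = defaultdict(set)
    for a, b, sim in pairs:
        if a != b and sim > threshold:
            neighbors[a].add(b)
            neighbors[b].add(a)

    groups = {}
    group_no = 0
    remaining = set(neighbors)
    while remaining:
        # Steps 2-3: pick the part with the most similar parts still remaining
        # (ties are broken alphabetically, which is an arbitrary choice).
        seed = max(sorted(remaining), key=lambda p: len(neighbors[p] & remaining))
        if not neighbors[seed] & remaining:
            break  # Step 8: nothing left that can be grouped
        # Step 4: the seed, its similar parts, and the parts similar to those.
        first_hop = (neighbors[seed] & remaining) | {seed}
        members = set(first_hop)
        for part in first_hop:
            members |= neighbors[part] & remaining
        # Steps 5-6: record the unique parts under the next grouping number.
        group_no += 1
        for part in members:
            groups[part] = group_no
        # Step 7: remove the grouped parts before the next iteration.
        remaining -= members
    return groups


sample = [
    ("A", "B", 0.95), ("B", "A", 0.95), ("A", "C", 0.92),
    ("C", "D", 0.91), ("E", "F", 0.99), ("A", "E", 0.50),
]
print(greedy_groups(sample))
```

In this made-up sample, D is not directly similar to A but is pulled into A's group through C, which is the behaviour described in step 4.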
Any help that could be provided would be great!
Thanks,
Kyle
Just wanted to check back in to see if anyone had any thoughts on this.
Hi @muddobber26 ,
I understand you want to do some cluster analysis.
If so, the Network Analysis tool may come in handy.
Workflow
You can change the parameter in the Filter tool to adjust the "similarity" threshold.
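To illustrate the idea behind this approach: if each part is a node and each pair above the threshold is an edge, the groups are the connected pieces of that network. Here is a minimal Python sketch of that idea using a union-find structure. It is not what the Network Analysis tool does internally, and the function name `connected_groups`, the tuple layout, and the 0-1 similarity scale are assumptions for illustration only.

```python
def connected_groups(pairs, threshold=0.90):
    """Return groups of parts linked by chains of pairs above the threshold.

    pairs is an iterable of (part, compared_part, similarity) tuples
    (hypothetical layout; similarity assumed to be a 0-1 fraction).
    """
    parent = {}

    def find(x):
        # Follow parent links to the group's representative, shortening
        # the path along the way.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Same role as the Filter step: only pairs above the threshold become edges.
    for a, b, sim in pairs:
        if sim > threshold:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a

    # Collect the members under each representative.
    clusters = {}
    for part in list(parent):
        clusters.setdefault(find(part), []).append(part)
    return sorted(sorted(members) for members in clusters.values())


sample = [
    ("A", "B", 0.95), ("A", "C", 0.92), ("C", "D", 0.91),
    ("E", "F", 0.99), ("A", "E", 0.50),
]
print(connected_groups(sample))                  # threshold 0.90
print(connected_groups(sample, threshold=0.93))  # stricter threshold
```

Raising the threshold drops edges, so groups split apart or disappear. Note that a long chain of pairwise-similar parts ends up in one group here, even if the parts at the two ends are not very similar to each other.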
Output (Interactive Chart)
Output (Data)
For details on the Network Analysis tool, search the community and you will find some good articles like below:
Good luck.
Thanks so much! Will take a look.