Alteryx Designer Desktop Discussions

How to cluster/group certain data in Alteryx

andrefm
5 - Atom

Hi All,

I have a table containing a list of items and a value representing the relationship between each pair of items. I would like to know if there is a way to generate n clusters (groups) from this type of data.

 

The data below is just a small sample; my real list contains over 100 items and 1,400 associations, and I would like to create, for example, 10 to 15 groups or clusters based on the value that associates the items.

 

2021-09-01 13_11_30-New Aisle Grouping.xlsx - Excel.png

 

Thank you in advance

dbmurray
8 - Asteroid

I was going to suggest looking into the K-Centroid Cluster Analysis tool in the Predictive set of tools, but can you explain your data further? Why are there two variables both named 'item', with what looks to be duplicate data (e.g. a row of item A, item A, value 100)?

surajmthomas
8 - Asteroid

You could use Cluster Analysis, but the dataset needs more explanation. The cluster configuration requires two or more fields to be selected.

 

Otherwise, you could get your matching records first and then add a further condition to split them by the group numbers you have assigned.

 

surajmthomas_0-1630469735919.png

 

andrefm
5 - Atom

Hi dbmurray,

To explain using the above data, assume that this table represents the compatibility between pairs of items.

 

Item A is 100% compatible with itself (Item A)

Item A is 90% compatible with item B, and so on.

As you can see, item D is only compatible with itself.

 

The expected result should group the items that are most similar to each other, generating X groups. Because this data set is small, 3 groups are sufficient for this example.

 

Hope this clarifies the problem a little more.

Thank you

dbmurray
8 - Asteroid

OK @andrefm, so I had a play with your data. Going by the highlighted table, it seems you also have a rule to exclude any value <70? If that is the case, I've filtered them out. The next step is then to use the Make Group tool.

 

dbmurray_0-1630477953133.png

This creates groups similar to your expected output. 

dbmurray_1-1630478023693.png

 

You can see A, B, and F belong to group 'A', E and C belong to 'C', and D is in its own group.

 

You might have to play around with the tool on your larger dataset, but it's fairly easy to configure.
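
For anyone who wants to see the logic outside of Designer, here is a minimal Python sketch of the same idea: drop pairs below the cutoff, then put every item that is linked (directly or through other items) into one group. It is only roughly analogous to what the filter plus grouping step does, not the tool's actual implementation. The function name `make_groups`, the `SAMPLE_PAIRS` values, and naming each group after its alphabetically first member are all assumptions made for illustration; only the A/A = 100 and A/B = 90 rows come from the thread.

```python
# Hypothetical sample: (item, item, compatibility). Only A/A=100 and A/B=90
# are taken from the thread; every other value is made up for illustration.
SAMPLE_PAIRS = [
    ("A", "A", 100), ("A", "B", 90), ("A", "F", 80), ("B", "F", 75),
    ("C", "C", 100), ("C", "E", 85), ("D", "D", 100),
    ("A", "C", 40), ("D", "E", 10),
]


def make_groups(pairs, threshold=70):
    """Group items that are linked by any chain of pairs with value >= threshold."""
    parent = {}  # union-find forest: item -> parent item

    def find(x):
        parent.setdefault(x, x)  # register unseen items as their own group
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for a, b, value in pairs:
        root_a, root_b = find(a), find(b)  # registers both items either way
        if value >= threshold and root_a != root_b:
            # Keep the alphabetically smaller root so the group is named after it.
            parent[max(root_a, root_b)] = min(root_a, root_b)

    groups = {}
    for item in list(parent):
        groups.setdefault(find(item), []).append(item)
    return {name: sorted(members) for name, members in groups.items()}


print(make_groups(SAMPLE_PAIRS))
```

Note that the compatibility value is only used as a pass/fail filter here, so on a dense dataset a long chain of pairs just above the cutoff can pull many items into one large group.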

dbmurray
8 - Asteroid

Just wondering @andrefm - did that approach solve your problem? If so - it would be great if you could mark my post above as the solution. Thanks 🙂

andrefm
5 - Atom

Thx dbmurray,

The result of your solution matches what I was after, but it does not work on a larger dataset, as it groups items based on whichever associations it finds, not necessarily based on the value ("compatibility").

 

I came to this flow:

 

andrefm_0-1630553377906.png

The idea I got from:

 

Comparing fields for similar data and group together

 

The output does not match what I described 100%, but the flow works OK with a larger dataset, which is what I was after.

Any other similar approach would be appreciated. 

Thank you
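
One other approach along these lines, sketched in Python rather than as a Designer workflow (and not the flow in the screenshot above): sort the pairs from most to least compatible and merge groups in that order, stopping as soon as the requested number of groups is reached. This is a simple single-linkage agglomerative scheme, so the value drives the order of merging instead of acting only as a filter. The function name `cluster_by_value` and the `SAMPLE_PAIRS` values are hypothetical; only the A/A = 100 and A/B = 90 rows come from the thread.

```python
# Hypothetical sample: (item, item, compatibility). Only A/A=100 and A/B=90
# are taken from the thread; every other value is made up for illustration.
SAMPLE_PAIRS = [
    ("A", "A", 100), ("A", "B", 90), ("A", "F", 80), ("B", "F", 75),
    ("C", "C", 100), ("C", "E", 85), ("D", "D", 100),
    ("A", "C", 40), ("D", "E", 10),
]


def cluster_by_value(pairs, n_groups):
    """Merge the most compatible pairs first until n_groups groups remain."""
    items = sorted({x for a, b, _ in pairs for x in (a, b)})
    parent = {item: item for item in items}  # union-find forest

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    remaining = len(items)  # every item starts as its own group
    for a, b, _ in sorted(pairs, key=lambda p: p[2], reverse=True):
        if remaining <= n_groups:
            break  # target number of groups reached
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[max(root_a, root_b)] = min(root_a, root_b)
            remaining -= 1

    groups = {}
    for item in items:
        groups.setdefault(find(item), []).append(item)
    return list(groups.values())


print(cluster_by_value(SAMPLE_PAIRS, n_groups=3))
```

If the pairs do not connect enough items, more than `n_groups` groups will remain. Single linkage can also still chain items together through intermediate neighbours, so an average-linkage variant may suit a dense dataset better.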
