Clustering Analysis

H_Alteryx
5 - Atom

Hello everyone.

 

I am having some trouble with clustering. I only recently started using Alteryx, so I am hoping someone has an answer to my problem :)

 

Regarding the data that I am trying to analyze (Excel file):

- Sheet #1: locations, with each location's latitude in column A and longitude in column B

- Sheet #2: other locations (different from Sheet #1), with each location's latitude in column A and longitude in column B

 

Illustrative example of the Excel file format (A, B, C, X, Y and Z stand for the numeric latitude and longitude values of each location):

[Image: H_Alteryx_1-1643620821701.png]

 

I am trying to perform two types of analysis:

- The first one only on data included in Sheet #1

- The second one between data included in Sheet #1 and data included in Sheet #2

 

--------------------------------------------------

 

First analysis:

 

The purpose is to create clusters by merging locations within an [X] km radius, with at most [Y] locations merged per cluster.

I want to be able to change [X] and [Y], so that I can compare multiple outcomes of the analysis.

 

I tried to use the "Find Nearest" tool, but its outputs do not seem to let me create clusters. For example:

- Location A is within [X] km of location B, so we might think that we can create a cluster at location A (by including location B in it)

- But location B is also within [X] km of locations C and D, so the cluster should instead be at location B (by including locations A, C and D in it)

The outputs of the "Find Nearest" tool do not show that kind of relationship, so I was not able to create clusters with it (maybe there is another way to do it with the "Find Nearest" tool, but I have not found it).
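To make the logic I have in mind more concrete, here is a rough sketch in plain Python (outside Alteryx, so not a workflow). It is only one possible greedy interpretation: repeatedly pick the unassigned location with the most unassigned neighbours within the radius as the cluster center, then merge its nearest neighbours up to the cap. The names `haversine_km`, `neighbours` and `greedy_clusters`, the sample coordinates, and the reading of [Y] as the total cluster size including the center are all assumptions made for illustration.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (approximation)


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) pairs given in degrees."""
    lat1, lon1 = map(radians, a)
    lat2, lon2 = map(radians, b)
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))


def neighbours(center, candidates, points, radius_km):
    """Names of the candidates within radius_km of center, nearest first."""
    near = [(haversine_km(points[center], points[p]), p) for p in candidates if p != center]
    return [p for d, p in sorted(near) if d <= radius_km]


def greedy_clusters(points, radius_km, max_size):
    """Greedy clustering sketch.

    points: dict mapping a location name to (lat, lon).
    max_size: assumed to be the total cluster size, center included ([Y]).
    Returns a dict mapping each cluster center to its members (center first).
    """
    unassigned = set(points)
    clusters = {}
    while unassigned:
        # Pick the location with the most unassigned neighbours in range
        # (ties are broken alphabetically because of the sorted() call).
        center = max(
            sorted(unassigned),
            key=lambda c: len(neighbours(c, unassigned, points, radius_km)),
        )
        # Keep only the nearest neighbours, up to the size cap.
        members = neighbours(center, unassigned, points, radius_km)[: max_size - 1]
        clusters[center] = [center] + members
        unassigned -= {center, *members}
    return clusters


# Hypothetical coordinates: B is about 5.6 km from A, C and D,
# while A, C and D are more than 6 km from each other.
points = {
    "A": (0.0, -0.05),
    "B": (0.0, 0.0),
    "C": (0.0, 0.05),
    "D": (0.05, 0.0),
}

full = greedy_clusters(points, radius_km=6.0, max_size=4)
capped = greedy_clusters(points, radius_km=6.0, max_size=2)
print(full)
print(capped)
```

With these sample points, the cluster ends up centered on B rather than A, which is the behaviour I could not get out of the "Find Nearest" outputs. Changing `radius_km` and `max_size` corresponds to changing [X] and [Y].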

 

I then tried the K-Centroids Cluster Analysis tool (I do not know if it is the right tool for what I am trying to do), but it does not seem to work with my data. See the example below, where I cannot select any fields:

[Image: H_Alteryx_0-1643620652491.png]

I have tried using "Run" anyway, but it always shows me an error.

Besides, as mentioned above, I do not know if it is the right tool to use, so if there is a better one, do not hesitate to share your thoughts on it.

 

Second analysis:

 

This analysis would lead to 2 different outputs:

- First output: create clusters by merging locations from Sheet #2 into locations from Sheet #1 within an [X] km radius, with at most [Y] locations merged per cluster, but without clustering the Sheet #1 locations with each other

- Second output: create clusters by merging locations from both Sheet #2 and Sheet #1 within an [X] km radius, with at most [Y] locations merged per cluster

 

To give an illustrative example, suppose that we have:

- Location A and location B in Sheet #1, within an [X] km radius of each other

- Location A in Sheet #1 is near locations C and D from Sheet #2 (also within an [X] km radius)

- Location B in Sheet #1 is not near any location from Sheet #2

 

Thus, the outputs would be:

- First output: a cluster at location A that includes locations C and D (so no clustering among the Sheet #1 locations; location B would remain as-is)

- Second output: a cluster at location A that includes locations B, C and D
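For the first output, the logic I am after could be sketched roughly as follows in plain Python (again outside Alteryx, and only as an illustration). Each Sheet #2 location is attached to the closest Sheet #1 location within the radius that still has room, and Sheet #1 locations are never merged with each other. The names `haversine_km` and `attach_to_sheet1`, the sample coordinates, and the reading of [Y] as the number of Sheet #2 locations allowed per Sheet #1 location are assumptions. The second output is not covered by this sketch.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (approximation)


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) pairs given in degrees."""
    lat1, lon1 = map(radians, a)
    lat2, lon2 = map(radians, b)
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))


def attach_to_sheet1(sheet1, sheet2, radius_km, max_merged):
    """Attach Sheet #2 locations to Sheet #1 locations.

    sheet1, sheet2: dicts mapping a location name to (lat, lon).
    max_merged: assumed cap on Sheet #2 locations per Sheet #1 location ([Y]).
    Returns a dict mapping every Sheet #1 name to its attached Sheet #2 names.
    """
    # Collect every Sheet #1 / Sheet #2 pair that is within the radius.
    pairs = []
    for name1, pos1 in sheet1.items():
        for name2, pos2 in sheet2.items():
            dist = haversine_km(pos1, pos2)
            if dist <= radius_km:
                pairs.append((dist, name2, name1))

    clusters = {name1: [] for name1 in sheet1}
    assigned = set()
    # Process the closest pairs first, so each Sheet #2 location goes to
    # its nearest Sheet #1 location that still has capacity.
    for _, name2, name1 in sorted(pairs):
        if name2 not in assigned and len(clusters[name1]) < max_merged:
            clusters[name1].append(name2)
            assigned.add(name2)
    return clusters


# Hypothetical coordinates matching the example: A and B are about 5.6 km
# apart, C and D are about 4.4 km from A and more than 6 km from B.
sheet1 = {"A": (0.0, 0.0), "B": (0.0, 0.05)}
sheet2 = {"C": (0.0, -0.04), "D": (0.04, 0.0)}

first_output = attach_to_sheet1(sheet1, sheet2, radius_km=6.0, max_merged=5)
capped_output = attach_to_sheet1(sheet1, sheet2, radius_km=6.0, max_merged=1)
print(first_output)
print(capped_output)
```

In this sketch, a Sheet #2 location that finds no Sheet #1 location with free capacity within the radius simply stays unattached.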

 

--------------------------------------------------

 

I do not know if my explanations are clear enough, so feel free to ask me further questions if needed.

 

From a new Alteryx user: thank you very much for your time.

 

Best regards.
