Clustering Analysis

H_Alteryx

Hello everyone.

 

I am having a little trouble with clustering. I only recently started using Alteryx, so I am hoping that someone has an answer to my problem :)

 

Regarding the data that I am trying to analyze (Excel file):

- Sheet #1: locations, with the latitude of each location in column A and its longitude in column B

- Sheet #2: other locations (different from those in Sheet #1), with the latitude of each location in column A and its longitude in column B

 

Illustrative example of the Excel file format (A, B, C, X, Y and Z stand for the numeric latitude and longitude values of each location):

(Attached screenshot: H_Alteryx_1-1643620821701.png)

 

I am trying to perform 2 types of analysis:

- The first one only on data included in Sheet #1

- The second one between data included in Sheet #1 and data included in Sheet #2

 

--------------------------------------------------

 

First analysis:

 

The purpose is to create clusters by merging locations that lie within a radius of [X] km, with at most [Y] locations merged into one cluster.

I want to be able to change [X] and [Y], so that I can produce several outcomes of the analysis.
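
To make the intended rule concrete, here is a minimal Python sketch of one possible greedy approach (it is not an Alteryx tool or workflow). Several things in it are assumptions rather than requirements stated above: the names `haversine_km` and `greedy_clusters` are hypothetical, distances are great-circle distances, [Y] is taken to count the cluster centre itself, and the centre is chosen as the location with the most unassigned neighbours. `radius_km` and `max_size` play the roles of [X] and [Y], and the coordinates are made-up toy values.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(a, b):
    """Great-circle distance in km between two (latitude, longitude) pairs in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (a[0], a[1], b[0], b[1]))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))


def greedy_clusters(points, radius_km, max_size):
    """Group locations greedily; returns {centre name: [centre, merged locations...]}."""
    unassigned = set(points)
    clusters = {}

    def neighbours(name):
        # Unassigned locations within radius_km of `name`, nearest first.
        hits = []
        for other in unassigned:
            if other != name:
                d = haversine_km(points[name], points[other])
                if d <= radius_km:
                    hits.append((d, other))
        return [other for _, other in sorted(hits)]

    while unassigned:
        # Assumption: the centre is the location with the most unassigned
        # neighbours; ties go to the alphabetically first name.
        centre = max(sorted(unassigned), key=lambda n: len(neighbours(n)))
        # Assumption: max_size ([Y]) counts the centre itself.
        members = [centre] + neighbours(centre)[:max(max_size - 1, 0)]
        clusters[centre] = members
        unassigned.difference_update(members)
    return clusters


# Made-up toy coordinates: B is about 4.4 km from A, C and D,
# while A, C and D are more than 5 km from one another.
toy = {"A": (0.0, 0.0), "B": (0.0, 0.04), "C": (0.0, 0.08), "D": (0.04, 0.04)}
wide = greedy_clusters(toy, radius_km=5.0, max_size=4)
tight = greedy_clusters(toy, radius_km=5.0, max_size=2)
print(wide)
print(tight)
```

Changing `radius_km` and `max_size` gives the different outcomes. A greedy pass like this depends on the order in which centres are picked, so it is only one way to read the rule and is not guaranteed to give the best possible grouping.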

 

I tried to use the Find Nearest tool, but its output does not seem to let me create clusters. For example:

- Location A is within [X] km of location B, so it might seem that we can create a cluster at location A (by including location B in it)

- But location B is also within [X] km of locations C and D, so the cluster should instead be at location B (by including locations A, C and D in it)

The output of the Find Nearest tool does not show this kind of relationship, so I was not able to create clusters with it (there may be another way to do it with the Find Nearest tool, but I have not found it).
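
The difficulty can be reproduced with a small Python sketch (outside Alteryx). It counts, for every location, how many other locations fall within the radius, which is the information needed to see that B is a better centre than A. The coordinates are made-up toy values chosen so that B is within 5 km of A, C and D while A, C and D are more than 5 km from one another; `haversine_km` and `neighbours_within` are hypothetical helper names.

```python
from math import asin, cos, radians, sin, sqrt


def haversine_km(a, b):
    """Great-circle distance in km between two (latitude, longitude) pairs in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (a[0], a[1], b[0], b[1]))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(h))  # 6371 km: mean Earth radius


def neighbours_within(points, radius_km):
    """Map each location to the sorted names of the other locations within radius_km."""
    return {
        name: sorted(
            other
            for other in points
            if other != name and haversine_km(points[name], points[other]) <= radius_km
        )
        for name in points
    }


# Made-up toy coordinates (degrees); 0.04 degrees is roughly 4.4 km here.
toy = {"A": (0.0, 0.0), "B": (0.0, 0.04), "C": (0.0, 0.08), "D": (0.04, 0.04)}
table = neighbours_within(toy, radius_km=5.0)

# The location with the most neighbours is the natural cluster centre;
# ties would go to the alphabetically first name.
best_centre = max(sorted(table), key=lambda name: len(table[name]))
for name in sorted(table):
    print(name, table[name])
print("best centre:", best_centre)
```

A per-location list of nearest points shows that A is close to B, but only a neighbour count across all locations shows that B, with three neighbours, should be the centre.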

 

I then tried the K-Centroids Cluster Analysis tool (I do not know whether it is the right tool for what I am trying to do), but it does not seem to work with my data. See the example below, where I cannot select any fields:

(Attached screenshot: H_Alteryx_0-1643620652491.png)

I have tried running the workflow anyway, but it always shows an error.

Besides, as mentioned above, I do not know whether this is the right tool, so if there is a better one, do not hesitate to share your thoughts.

 

Second analysis:

 

This analysis would lead to 2 different outputs:

- First output: create clusters by merging locations from Sheet #2 into locations from Sheet #1 within a radius of [X] km, with at most [Y] locations merged into one cluster, but without clustering the Sheet #1 locations with each other

- Second output: create clusters by merging locations from both Sheet #1 and Sheet #2 within a radius of [X] km, with at most [Y] locations merged into one cluster

 

To give an illustrative example: suppose that we have:

- Location A and location B from Sheet #1 are within [X] km of each other

- Location A from Sheet #1 is also within [X] km of locations C and D from Sheet #2

- Location B from Sheet #1 is not within [X] km of any location from Sheet #2

 

Thus, the outputs would be:

- First output: a cluster at location A that includes locations C and D (the Sheet #1 locations are not clustered with each other, so location B remains as-is)

- Second output: a cluster at location A that includes locations B, C and D
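
The two outputs of this example can be reproduced with a minimal Python sketch (outside Alteryx). It uses one greedy routine in which the set of allowed centres and the set of locations that may be merged are passed separately: for the first output only Sheet #1 locations can be centres and only Sheet #2 locations can be merged, and for the second output every location can be either. The helper names `haversine_km` and `merge_locations`, the greedy rule, the choice that [Y] counts the centre itself, and the toy coordinates are all assumptions made for illustration.

```python
from math import asin, cos, radians, sin, sqrt


def haversine_km(a, b):
    """Great-circle distance in km between two (latitude, longitude) pairs in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (a[0], a[1], b[0], b[1]))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(h))  # 6371 km: mean Earth radius


def merge_locations(points, centre_names, mergeable_names, radius_km, max_size):
    """Greedy sketch: centres absorb nearby mergeable locations, at most
    max_size locations per cluster (the centre included)."""
    free = set(mergeable_names)        # locations that may still be merged
    centres_left = set(centre_names)   # locations that may still become a centre
    clusters = {}

    def near(centre):
        # Free locations within radius_km of `centre`, nearest first.
        hits = []
        for other in free:
            if other != centre:
                d = haversine_km(points[centre], points[other])
                if d <= radius_km:
                    hits.append((d, other))
        return [other for _, other in sorted(hits)]

    while centres_left:
        # Pick the candidate centre with the most free neighbours
        # (ties go to the alphabetically first name).
        centre = max(sorted(centres_left), key=lambda c: len(near(c)))
        members = [centre] + near(centre)[:max(max_size - 1, 0)]
        clusters[centre] = members
        free.difference_update(members)
        centres_left.difference_update(members)
    return clusters


# Made-up toy coordinates: A is about 4.4 km from B, C and D,
# while B, C and D are more than 5 km from one another.
sheet1 = {"A": (0.0, 0.04), "B": (0.0, 0.0)}
sheet2 = {"C": (0.0, 0.08), "D": (0.04, 0.04)}
points = {**sheet1, **sheet2}

# First output: Sheet #2 locations merge into Sheet #1 locations only.
first_output = merge_locations(points, sheet1, sheet2, radius_km=5.0, max_size=4)
# Second output: locations from both sheets can be centres or be merged.
second_output = merge_locations(points, points, points, radius_km=5.0, max_size=4)
print(first_output)
print(second_output)
```

With these toy values the first output keeps B on its own and merges C and D into A, and the second output merges B, C and D into A, which matches the expected results described above.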

 

--------------------------------------------------

 

I do not know if my explanations are clear enough, so feel free to ask me further questions if needed.

 

From a new Alteryx user: thank you very much for your time.

 

Best regards.
