Alteryx Designer Desktop Discussions

SOLVED

Nearest Point workflow with very large data set

Chris_W
7 - Meteor

Hi there,

 

I have 2 spatial datasets: one with about 20,000 points (my stores) and another with about 1.7 million points (my customers). I'm trying to find each customer's nearest store (within 50 miles drive distance), and then output my results as average drive distance per US state.

 

The problem is, I simply have too much customer data to use the Find Nearest tool. I have tried the approach here https://community.alteryx.com/t5/Alteryx-Knowledge-Base/Alternative-to-Find-Nearest-Tool-that-Could-..., but the run still fails due to memory limitations.

 

Has anyone encountered a similar problem and can suggest a solution?

 

Thanks

Chris

4 REPLIES
andre347
10 - Fireball

I've not tried this, but what if you create a batch macro (or even an iterative macro) to loop over the stores? The macro would run once per store (so basically 600 times) and find the nearest customers for that store. You can filter down to one store just before the Find Nearest tool and then use a control parameter to iterate over the customer or store list.

Chris_W
7 - Meteor

Great idea! Thanks Andre. My end goal is to create a model that has the average drive distance for each customer, grouped by the customer's state, and to see how that drive distance changes when 100 of those stores are removed.

 

I'm still very new to Alteryx, but would this be the best way to achieve that?

1) Get the distance from each customer to each store. Essentially this would produce a file of 1.7m rows x 20,000 cols (I now have 20,000 stores I am looking at)

2) Then take the min distance for each row and call that 'closest' store

3) Group my customers by State and average the drive distance

 

Sound okay or is there an easier way?
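
For reference, steps 1 to 3 can be sketched outside Alteryx in Python without ever building the full 1.7m x 20,000 table: keep only the running minimum per customer and accumulate the per-state totals as you go. This is a minimal sketch under stated assumptions, not an Alteryx workflow. It uses straight-line (haversine) distance as a stand-in for drive distance, and the field names (`id`, `lat`, `lon`, `state`) and sample points are made up for illustration. It shows the memory shape of the approach; a plain Python loop would be far too slow for the real data volumes.

```python
import math
from collections import defaultdict

EARTH_RADIUS_MILES = 3958.8


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles (an assumed stand-in for drive distance)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def nearest_store(customer, stores, max_miles=50.0):
    """Return (store_id, miles) for the closest store within max_miles, else None."""
    best_id, best_dist = None, max_miles
    for store in stores:
        d = haversine_miles(customer["lat"], customer["lon"], store["lat"], store["lon"])
        if d <= best_dist:  # steps 1 and 2 combined: only the running minimum is kept
            best_id, best_dist = store["id"], d
    return (best_id, best_dist) if best_id is not None else None


def average_distance_by_state(customers, stores, max_miles=50.0):
    """Step 3: average nearest-store distance per state, skipping unmatched customers."""
    totals = defaultdict(lambda: [0.0, 0])  # state -> [sum of miles, matched customers]
    for customer in customers:
        hit = nearest_store(customer, stores, max_miles)
        if hit is None:
            continue
        totals[customer["state"]][0] += hit[1]
        totals[customer["state"]][1] += 1
    return {state: total / count for state, (total, count) in totals.items()}


# Hypothetical sample data, not taken from the thread.
STORES = [
    {"id": "S1", "lat": 40.0, "lon": -75.0},
    {"id": "S2", "lat": 34.0, "lon": -118.0},
]
CUSTOMERS = [
    {"id": "C1", "state": "PA", "lat": 40.0, "lon": -75.0},
    {"id": "C2", "state": "PA", "lat": 40.1, "lon": -75.0},
    {"id": "C3", "state": "CA", "lat": 34.0, "lon": -118.0},
    {"id": "C4", "state": "TX", "lat": 31.0, "lon": -100.0},  # no store within 50 miles
]

if __name__ == "__main__":
    print(average_distance_by_state(CUSTOMERS, STORES))
```

To compare the two scenarios, the same function could be called a second time with the 100 closed stores filtered out of `stores`, and the per-state averages compared.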

andre347
10 - Fireball

Yes, that sounds about right. I would, however, start by filtering on the customer's state, which reduces the size of each run significantly. You can also use the customer state in your loop when you use a batch macro: first reduce the data to one state, then use a control parameter on store, and then use the Find Nearest tool (which has a built-in distance calculation) to construct your batch macro.
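
The state-by-state batching idea can be sketched in Python as follows. This is only an illustration under assumptions, not how the batch macro is built: it uses straight-line (haversine) distance instead of drive distance, the field names and sample points are hypothetical, and the store prefilter (a bounding box around each state's customers, padded by roughly 50 miles) is an added technique so that stores just across a state border are still considered.

```python
import math
from collections import defaultdict

EARTH_RADIUS_MILES = 3958.8
MILES_PER_DEGREE = 69.0  # slightly under the true ~69.09, so the padding errs wide


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles (an assumed stand-in for drive distance)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def batches_by_state(customers):
    """Split customers into one batch per state."""
    groups = defaultdict(list)
    for customer in customers:
        groups[customer["state"]].append(customer)
    return groups


def stores_near_batch(batch, stores, max_miles=50.0):
    """Approximate prefilter: stores inside the batch's padded bounding box.

    Ignores wrap-around at the 180th meridian.
    """
    lats = [c["lat"] for c in batch]
    lons = [c["lon"] for c in batch]
    pad_lat = max_miles / MILES_PER_DEGREE
    # Longitude degrees shrink toward the poles, so pad using the widest latitude.
    widest_lat = min(89.0, max(abs(min(lats)), abs(max(lats))) + pad_lat)
    pad_lon = max_miles / (MILES_PER_DEGREE * math.cos(math.radians(widest_lat)))
    return [
        s for s in stores
        if min(lats) - pad_lat <= s["lat"] <= max(lats) + pad_lat
        and min(lons) - pad_lon <= s["lon"] <= max(lons) + pad_lon
    ]


def average_distance_by_state_batched(customers, stores, max_miles=50.0):
    """Process one state at a time and average each customer's nearest-store distance."""
    averages = {}
    for state, batch in batches_by_state(customers).items():
        nearby = stores_near_batch(batch, stores, max_miles)
        nearest = []
        for c in batch:
            in_range = [
                d for d in (
                    haversine_miles(c["lat"], c["lon"], s["lat"], s["lon"]) for s in nearby
                )
                if d <= max_miles
            ]
            if in_range:
                nearest.append(min(in_range))
        if nearest:
            averages[state] = sum(nearest) / len(nearest)
    return averages


# Hypothetical sample data, not taken from the thread.
STORES = [
    {"id": "S1", "lat": 40.0, "lon": -75.0},
    {"id": "S2", "lat": 34.0, "lon": -118.0},
]
CUSTOMERS = [
    {"id": "C1", "state": "PA", "lat": 40.0, "lon": -75.0},
    {"id": "C2", "state": "PA", "lat": 40.1, "lon": -75.0},
    {"id": "C3", "state": "CA", "lat": 34.0, "lon": -118.0},
    {"id": "C4", "state": "TX", "lat": 31.0, "lon": -100.0},  # no store within 50 miles
]

if __name__ == "__main__":
    print(average_distance_by_state_batched(CUSTOMERS, STORES))
```

Each batch only ever holds one state's customers and the handful of stores near them, which is the point of reducing the data to one state before running the nearest-store step.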

Chris_W
7 - Meteor

Awesome. Thanks Andre. Nice challenge for a grey Friday afternoon :)
