How to join back my Summarised data?
- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hello All,
Sadly I am unable to share data, but I hope my example gives enough detail.
I have the following data:
- 24000 different customers, each with an exact Lat/Long. I have given each of these a unique recordID and create at Spatial Point for each
- 2280 different stores, each with an exact Lat/Long and I have created a spatial point for each.
I have then appended the Customer with the stores and been able to fine the distance for each customer to the nearest stores (Using same logic as https://community.alteryx.com/t5/Alteryx-Designer-Discussions/Distance-between-two-sets-of-Lat-Longs...) This gives me some 50.8M combos!. The data looks something like this:
Record ID (Customer) | Store ID | Distance in Mile |
1 | 1 | 50.2123123 |
1 | 2 | 1.3127363 |
1 | 3 | 13.3191057 |
What I need to do is somehow get the shortest distance and Store ID for each customer (Record ID). I have tried using the summarise tool to find the shortest distance for each customer (Summarised as group by Record ID, min(Distance)) which worked, but I don't know how to link that back to the customer and keep the store ID?
I have tried matching joining by Record ID and distance in mileage to the summarised minimum distance in mileage but it failed to correct match
Any ideas? Thank you
Solved! Go to Solution.
- Labels:
- Data Investigation
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hi @Bobbins ,
If I understood correctly I think you can, instead of summarizing, sort your data by Customer ID then Distance ascending and use a sample tool to keep 1st line for each Customer ID.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hi @Bobbins
Can you provide more sample data with at least 2-3 rows for each storeID. We can look into it 🙂
What @Jean-Balteryx suggested should work. Summarize should also work.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hi Jean,
Okay, thats a different way, I can understand the sort part, but how do i use the sample tool to keep the 1st line? Change N = number of stores if i am correct?
Thanks
Bob
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hi @Bobbins
Just a thought since this is spatial why not diretly use Find nearest tool. Find the nearest store for each customer.
https://community.alteryx.com/t5/Interactive-Lessons/Identifying-Nearest-Objects/ta-p/154106
This should save you appending and finding the nearest. Which is computation heavy.
Number of rows will also be less in find nearest.
Hope this helps : )
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Use the option "First N Rows", set N = 1 and tick your Customer ID field in the "Group by column (optional)" section.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hi,
I wanted to say that though I used Jean-Balteryx ideas, I went back and looked at the solution provided by atcodedog05 which just blew my mind, my old way was taking about 5 minutes, that way took 2.6 seconds.
Thanks all
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
His solution is way more efficient ! 🙂
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hi @Bobbins
Yes, in this scenario find nearest would be the best way to do it 🙂
Happy to help : )
Cheers and have a nice day!
