Milwaukee, WI

Welcome to the Milwaukee User Group

Click in the JOIN GROUP button to follow our news and attend our events!

First Annual Milwaukee Alteryx Users DATA CHALLENGE

jmgrant
6 - Meteoroid

The MAUG Leadership team is pleased to announce the

First Annual Milwaukee Alteryx Users DATA CHALLENGE!

 

Participants should post their solutions to this thread, please include a workflow (with all associate assets necessary to run it) and a short presentation (PowerPoint Deck, short video, images, etc).

 

Submissions are due by November 20th.  After that we will post a method for users to vote on their preferred solution.  Winners will be announced at the Q4 MAUG meeting / Holiday party (tentatively scheduled for December 5th).

 

A zipped archive with the dataset and a PowerPoint are attached.  A full explanation of the Data Challenge, the problem, rules, etc, can be found in the attached PowerPoint.

 

NOTE: Much of this data is fictitious or randomized in some way, please do not use for any purpose other than the data challenge!!  A data explanation can be found in the attached PowerPoint.

The Problem

A jolly anonymous benefactor wants to send holiday parcels to Milwaukee homes purchased since 2002.

 

You are going to make some of the deliveries using a magic flying vehicle.

 

The catch is that the vehicle can only fly 200 miles total as the crow flies.

 

You will need to prioritize your delivery in what ever fashion you choose, there are no wrong answers and the User Group will vote on which solution is the winner.

 

The Challenge

 

1) There are some odd values in the data set, address data issues and resolve them.

 

2) Determine a method of prioritizing your deliveries (there are no wrong answers) this will affect/inform your sequence.

 

3) You may choose any location in the file as your starting point. Justify the starting point.

 

4) You must build a sequence poly-line (spatial>poly-build) with a distance of less than 200 miles as the crow flies (spatial>spatial info).

 

5) Post your workflow and a short presentation or explanation to this thread.

 

We would like to have at least some submissions do a quick presentation at the Q4 meeting!

 

A resource you may wish to consider is a post titled Santa’s First Iterative Macro on the community.
No distance component in the example, but it would be great to see enhancements!! Submissions need not be this complex or utilize macros all levels of complexity are appreciated!

 

The Rules

 

1) You must be a member of the Milwaukee Alteryx User Group (MAUG) to win.

 

2) Winners will be chosen based on votes from the MAUG community (method of vote will be announced once all submissions are received).

 

3) Winners will be announced at the 2019 Q4  MAUG Meeting / Holiday Party.

 

4) You may augment the dataset in any way you like.

 

5) Alteryx swag, other awards, and bragging rights for a year go to the winners!!

 

5 REPLIES 5
ColleenH218
8 - Asteroid

@jmgrant this is awesome! I can't wait to see what people come up with!

mmontgomery
11 - Bolide

 

Spoiler
High-level approach:
1. I wanted to clean up the dataset by getting rid of duplicate home addresses. I accomplished that by selecting the most recent sale date for that address
2. Grouped by long/lat to get counts of addresses and parcels
3. Picked most popular long/lat (43.08052,-87.88772) as starting point
4. Tweaked Find Nearest spatial tool and Sample tool to generate the most amount of parcels closest to the most popular long/lat while staying under 200 miles traveled. (3,373 parcels delivered within 199.85502 miles traveled)
5. For context on community impact, I appended the likelihood of poverty percentile rankings, total population and total distance in the raw dataset to the final dataset. The final dataset was around the 77th percentile of poverty with 3% of the population reached and 0.22% of the total distance reached.

For more details, please see the tool annotations in the attached workflow.

Thanks,
Matt Montgomery



MM Final Workflow.PNGPoverty Percentile.PNG

map.PNG

 

 

mmontgomery
11 - Bolide

Apologies. I attached the wrong workflow, this is the most updated workflow

kmcdaniel
8 - Asteroid

I've recently spent some time exploring macros and finally felt prepared to take on this challenge question.

 

My approach has roughly 3 steps:

1) Removing unnecessary variables and observations from the dataset.

2) Using a slightly modified version of the Santa's iterative macro to repeatedly find the nearest undelivered residence, assign stop numbers and calculate distances between points.

3) Tidy up results: Tabulate distance traveled and number of packages delivered, display a route map.

 

Thanks MAUG for the educational challenge and holiday whimsy!

 

 

Spoiler
1) FILTERING AND SUBSETTING
Since the original dataset was quite large and I wanted to build an iterative macro, I thought shrinking the size of the large initial dataset would likely help improve the performance speed of my macro (less rows for it to assess when repeatedly trying to find nearest neighbors).  

I use the select tool to immediately drop the columns that I'm not going to use (my Santa has solutions for delivering to residences with dogs, without chimneys and he isn't interested in socio-economic status).

Santa only wants to deliver to homes with children, so I filter the dataset to homes where children are present.

Looking at a map of where the remaining points in the dataset are located, one can see a pretty clear strip through the middle of the city where there are no homes (because of the interstate, river, train line and downtown). I decided that it's sensible to try to stay on one side of the city to prevent doing any wasted travel across this region. So I further subset the data by filtering on a line of latitude. My Santa is only delivering to kids on the North side of town.

After exploring my remaining homes, I chose a specific house (on the edge of a dense neighborhood on the east side) to give my Santa a sensible place to start his route. I used the filter and union tools to push this record to the top of the dataset. 

Lastly, I add a blank column named DistanceMiles, which my macro is expecting.

DataPrep.png


2) SANTA'S ITERATIVE DELIVERY MACRO

Mine is essentially the same as the "Santa's first iterative macro", with two minor changes.
First, It's built to expect my subset of columns from the MAUG dataset as it's input, rather than the used in the Santa's iterative macro demo. 

Second, this technique CAN be used to calculate distances. Unlike in the original macro, I'm outputting the DistanceMiles field from the FindNearest tool and carry it through the selects and joins.

 

Briefly, the iterative macro takes the top record in the dataset and finds its nearest neighbor in the dataset. The top record is then output, along with the distance between it and its neighbor in the DistanceMiles field and the Stop Number (iteration number). The nearest neighboring record is pushed to the top of the dataset and the process begins again. From the interface designer, the macro can be set to run as many times as desired. 

 

Santa_macro_innards.png

3) LAST STEPS AND LOOK AT RESULTS
Both the santa's first iterative macro and my delivery macro seem to sometimes get hung up on certain records and output the same record more than once.  I'm cleaning this weirdness up with the Unique tool then resorting the dataset by StopNumber. I use multi-row formulas to keep track of the cumulative number of packages delivered and miles traveled as Santa goes about his route.  I filter any iterations (deliveries) made after the 200 mile stipulation in the problem statement.  I use polybuild to draw Santa's delivery route and a report map to show both the delivery route and location of the delivery points.

final_steps_pic.png

Map of Santa's first 50 stops:

Map50.png

 

 

In 200 miles, my Santa makes 3980 deliveries and hands out a total of 12278 parcels. His full route for the night is below.

Map5000.png

 

kmcdaniel
8 - Asteroid