Author: Andy Moncla ( @AndyMoncla ), Chief Operating Officer & Alteryx ACE
Company: B.I. Spatial
Awards Category: Best Use of Spatial
With Spatial in our company name we use Spatial analytics every day. We use Spatial analytics to better understand consumer behavior, especially relative to the retail stores, restaurants and banks they use. We are avid proponents and users of customer segmentation. We rely on Experian's Mosaic within ConsumerView. In the last 2 years we have invested heavily in understanding the appropriate use of Mobile Device Location data. We help our clients use the mobile data for better understanding their customers as well as their competitors' customers and trade areas.
Describe the problem you needed to solve
Among retail, restaurant and financial services location analysts, one of the hottest topics is using mobile device location data as a surrogate for customer intercept studies. The beauty of this data, when used properly, is that it provides incredible insight. We can define home and work trade areas, differentiate between a shopping center’s trade areas and its anchors’, understand shopping preferences, identify positive co-tenancies, and perform customer segmentation studies.
The problem, or opportunity, we wanted to solve was to:
1. Develop a process that would allow us to clean/analyze each mobile device’s spatial data in order to determine its most probable home location
2. Build a new, programmatic trade area methodology that would best represent the mall/shopping center visitors’ distribution
3. Easily deliver the trade areas and their demographic attributes
And, it had to scale. You see, our company entered into a partnership with UberMedia and the Directory of Major Malls to develop residence-based trade areas for every mall and shopping center in the United States and Canada – about 8,000 locations. We needed to get from 100 billion rows of raw data to 8,000 trade areas.
Describe the working solution
Before I get into the details I’d like to thank Alteryx for bringing Paul DePodesta back as a Keynote Speaker this year at Inspire. Paul spoke at a previous Inspire and his advice to keep a journal was critical to the success of this project. I actually kept track of CPU and memory usage as I worked to make each workflow as efficient as possible. Thanks for the advice, Paul.
Using only Alteryx Spatial, we were able to accomplish our goal. Without giving away the secret sauce, here’s what we did. We divided the task into three parts which I will describe below.
1. Data Hygiene and Analysis (8 workflows for each state and province) – The goal of this portion was to identify the most likely home location for each unique device. It is important to note that the raw data is fraught with bad data, including common device identifiers, false location data and location points that could not be a home location. To clean the data, nearly all of the 100 billion rows of data were touched dozens of times. Here are some of the details.
a. Common Device Identifiers
i. The Summarize tool was used to identify those device IDs, which were then removed with a Filter tool
ii. Devices with improper lengths were also removed using the Filter tool
b. False Location Data – every now and again there is a lat/long that has an inexplicably high number of devices (think tens or hundreds of thousands). These points were eliminated using algorithms built from the Create Points, Summarize and Formula tools, coupled with spatial filtering.
c. Couldn’t be a Home Location – For a point to be considered as a likely home location, it had to be within a populated Census Block and not within other spatial features. We downloaded the Census Blocks from the Census and, utilizing the TomTom data included within Alteryx Spatial, built a series of spatial filter files for each US state and Canadian province. To build the spatial filters (one macro with 60+ tools), we used the following spatial tools:
i. Create Points
ii. Trade Area
iii. Buffer
iv. Spatial Match
v. Distance
vi. Spatial Process Cut
vii. Summarize - SpatialObj Combine
Once the filters were built all of the data was passed through the filters, yielding only those points that could possibly be a home location.
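For readers who think in code rather than Alteryx tools, here is a rough Python sketch of the first two hygiene steps (common device identifiers and false location data). It is not our workflow. The function name, the thresholds, the expected ID length and the idea of flagging a "common" ID by its row count are all assumptions made for illustration, and the Census Block and spatial feature filters are left out because they need polygon data.

```python
from collections import Counter, defaultdict


def clean_pings(pings, id_length=36, max_rows_per_id=100_000,
                max_devices_per_point=10_000, precision=5):
    """Drop suspicious rows from (device_id, lat, lon) tuples.

    All thresholds are illustrative assumptions, not production values.
    """
    # Step a.ii: remove device IDs with an improper length.
    pings = [p for p in pings if len(p[0]) == id_length]

    # Step a.i: summarize rows per ID, then filter out IDs that occur
    # implausibly often (assumed here to indicate a shared/common ID).
    rows_per_id = Counter(p[0] for p in pings)
    pings = [p for p in pings if rows_per_id[p[0]] <= max_rows_per_id]

    # Step b: count distinct devices at each rounded lat/long and drop
    # points that attract an inexplicably high number of devices.
    devices_at_point = defaultdict(set)
    for device_id, lat, lon in pings:
        key = (round(lat, precision), round(lon, precision))
        devices_at_point[key].add(device_id)

    return [
        (device_id, lat, lon)
        for device_id, lat, lon in pings
        if len(devices_at_point[(round(lat, precision),
                                 round(lon, precision))]) <= max_devices_per_point
    ]
```

In Alteryx terms, the two counting steps play the role of the Summarize tool and the list comprehensions play the role of the Filter tool.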
Typically, there are over one thousand observations per device, so even after the filtering there was work left to be done. We built a series of workflows that took advantage of the Calgary tools so that we could analyze each device, individually. Since every device record was timestamped, our workflows were able to identify clusters of activity over time and calculate the most likely home location. Tools critical to this process included:
The Hygiene portion of this process reduced 100 billion rows of raw data to about 45 million likely home locations.
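To give a feel for the per-device analysis, here is a minimal Python sketch of one way to pick a likely home location from timestamped points. It is a simplified stand-in, not our actual algorithm: the overnight window, the grid-cell rounding, the "most distinct nights" rule and the name `likely_home` are all assumptions, and timestamps are assumed to be in local time.

```python
from collections import defaultdict
from datetime import timedelta


def likely_home(observations, night_start=22, night_end=6, precision=3):
    """Return an estimated (lat, lon) home for ONE device, or None.

    observations: iterable of (timestamp, lat, lon) with datetime timestamps.
    The overnight window and grid precision are illustrative assumptions.
    """
    nights_by_cell = defaultdict(set)
    points_by_cell = defaultdict(list)

    for ts, lat, lon in observations:
        if night_end <= ts.hour < night_start:
            continue  # daytime observation, ignored in this sketch
        # Shift by 12 hours so 23:00 and the following 02:00 share one night.
        night = (ts - timedelta(hours=12)).date()
        cell = (round(lat, precision), round(lon, precision))
        nights_by_cell[cell].add(night)
        points_by_cell[cell].append((lat, lon))

    if not nights_by_cell:
        return None

    # Pick the cell seen on the most distinct nights; break ties by point count.
    best = max(nights_by_cell,
               key=lambda c: (len(nights_by_cell[c]), len(points_by_cell[c])))
    pts = points_by_cell[best]
    return (sum(p[0] for p in pts) / len(pts),
            sum(p[1] for p in pts) / len(pts))
```

Counting distinct nights rather than raw points keeps one long stay somewhere else from outweighing a place the device returns to night after night.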
2. Trade Area Delineation (4 workflows/macros for each mall and shopping center, run iteratively until capture rate was achieved) – We didn’t want to manually delineate thousands of trade areas. We did want a consistent, programmatic methodology that could be run within Alteryx. In short, we wanted the trade area method to produce polygons that depicted concentrations of visitors without including areas that didn’t contribute. We also didn’t want to predefine the extent of the trade areas; i.e. 20 minutes. We wanted the data to drive the result. This is what we did.
a. Devised a Nearest Neighbor Methodology and embedded it within a Trade Area Macro – Creates a trade area based on each visitor’s proximity to other visitors. Tools used in this Macro include:
i. Calgary
ii. Calgary Join
iii. Distance
iv. Sort
v. Running Total
vi. Filter
vii. Find Nearest
viii. Tile
ix. Summarize – SpatialObj Combine
x. Poly-Split
xi. Buffer
xii. Smooth
xiii. Spatial Match
b. Nest the Trade Area Macro within an Iterative Macro – Placing the Trade Area Macro within the Iterative Macro allows Alteryx to run the Trade Area Macro through multiple scenarios until the trade area capture rate is achieved
c. Nest the Iterative Macro within a Batch Macro – Nesting the Iterative Macro within the Batch Macro allows us to run an entire state at once
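The "run until the capture rate is achieved" idea can be sketched in a few lines of Python. This is only an illustration of the iterative pattern, not the secret sauce: the function names, the starting distance, the step size and the 70% target are assumptions, and the polygon steps (Buffer, Combine, Smooth) are left out.

```python
import math


def haversine_miles(a, b):
    """Great-circle distance in miles between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 3958.8 * 2 * math.asin(math.sqrt(h))


def select_core_visitors(homes, target_capture=0.7, start_miles=0.25,
                         step_miles=0.25, max_iterations=100):
    """Widen a nearest-neighbor threshold until enough visitors are captured.

    homes: list of (lat, lon) visitor home locations for one center.
    Returns (kept_homes, threshold_miles, capture_rate).
    """
    if len(homes) < 2:
        raise ValueError("need at least two visitor homes")

    # Distance from each visitor to their nearest other visitor (O(n^2) sketch).
    nearest = [
        min(haversine_miles(p, q) for j, q in enumerate(homes) if j != i)
        for i, p in enumerate(homes)
    ]

    for i in range(max_iterations):
        threshold = start_miles + i * step_miles
        kept = [p for p, d in zip(homes, nearest) if d <= threshold]
        capture = len(kept) / len(homes)
        if capture >= target_capture:
            break  # capture rate achieved, stop iterating
    return kept, threshold, capture
```

The loop stands in for the Iterative Macro, and calling `select_core_visitors` once per center across a state would stand in for the Batch Macro. Isolated visitors drop out first, which is why the result follows visitor concentrations instead of a fixed ring or drive time.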
The resultant trade areas do a great job of depicting where the visitors live. Although rings and drive times are great tools, especially when considering new sites, trade areas based on behavior are superior. For the shopping center below, a ring would have included areas with low visitor concentrations, but high populations.
3. Trade Area Attribute Collection and Preparation (15 workflows) – Not everyone in business has mapping software, but many are using Tableau. We decided that we could broaden our audience by simply making our trade areas available within Tableau.
Using Alteryx, we were able to easily export our trade areas for Tableau.
Build Zip Code maps.
For our clients that use Experian’s Mosaic or PopStats demographics, Alteryx allows us to attach the trade area attributes.
Describe the benefits you have achieved
The benefits we have achieved are incredible.
The impact to our business is that both our client list and industry coverage have more than doubled without having to add headcount. By year end, we expect our clients’ combined annual sales to top $250 billion. Our own revenues are on pace to triple.
Our clients are abandoning older customer intercept methods and depending on us.
Operationally, we have repeatable processes that are lightning fast. We can now produce a store or shopping center’s trade area in minutes. Our new trade area methodology has been very well received and is frequently requested.
Personally, Alteryx has allowed me to harness my nearly 30 years of spatial experience and create repeatable processes and to continually learn and get better. It’s fun to be peaking almost 30 years into my career.
Since we have gone to market with the retail trade area product we have heard “beautiful”, “brilliant” and “makes perfect sense.” Everyone loves a pat on the back, but, what we really like hearing is “So, what’s Alteryx?” and “Can we get pricing?”