
Excellence Awards 2016: Michael Barone - Most Time Saved

Author: Michael Barone, Data Scientist

Company: Paychex Inc.

 

Awards Category: Most Time Saved

 

We currently have more than two dozen predictive models, pulling data of all shapes and sizes from many different sources. A full round of scoring takes 4 hours of processing time. Before Alteryx, we had a dozen models, and processing took around 96 hours. That's a 2x increase in our model portfolio, but a 24x decrease in processing time.

 

Describe the problem you needed to solve 

Our Predictive Modeling group, which began in the early-to-mid 2000s, had grown from one person to four people by summer 2012. I was one of those four. Our portfolio had grown from one model to more than a dozen. We were what you might call a self-starting group. While we had the blessing of upper management, we were small and independent, doing all research, development, and analysis ourselves. We started out using the typical everyday enterprise software solutions. While those solutions worked well for a few years, we had outgrown them by the time we were up to a dozen models. A typical round of "model scoring", which we did at the beginning of every month, took about two-and-a-half weeks, and ninety-five percent of that was system processing time, which consisted of cleansing, blending, and transforming the data from varying sources.

 

Describe the working solution

We blend data from our internal databases - everything from Excel and Access to Oracle, SQL Server, and Netezza. Several models include data from third-party sources such as D&B and the Experian CAPE file we get with our Alteryx data package.

 

Describe the benefits you have achieved

We have recently taken on projects that require us to process and analyze billions of records. Thanks to Alteryx, and more specifically the Calgary format, most of our time is spent analyzing the data, not pulling, blending, and processing it. This leads to faster delivery of results and faster business insight.