Overview of Use Case
OnePlus Systems Inc. is an Illinois-based manufacturer of container fullness and control systems. Their monitoring and security solutions make it possible for some of the world’s largest retailers, manufacturers, property managers, hospitals and hospitality operators to reduce costs, increase operating efficiency, and gain insight into the daily functioning of their businesses. In this use case the product manager shows the life cycle of model building, how Alteryx tools were used to build a predictive model, and the overall iterative predictive process.
Describe the business challenge or problem you needed to solve
Our challenge was to increase renewal rates without increasing costs for our organization, so we built a predictive model to target at-risk members. We then shifted resources based on the business need: if you're in the more-likely-to-renew group, we'll send you an email; if you're not likely to renew, you'll receive a personal call. This, however, was an arduous and iterative process. It starts with business understanding, goes through deployment, and then comes back to business understanding to verify the model's validity and accuracy. You need to know who the stakeholders are, define the business problem(s), understand the data, and then focus on data prep, modeling, evaluation, and deployment.
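The routing rule described above can be sketched in a few lines of Python. This is only an illustration, not the workflow from the story: the function name `assign_outreach`, the 0.5 threshold, and the member scores are all assumptions.

```python
def assign_outreach(renewal_probability, threshold=0.5):
    """Pick an outreach channel from a predicted renewal probability.

    The 0.5 threshold is an illustrative assumption, not a value from the story.
    """
    if not 0.0 <= renewal_probability <= 1.0:
        raise ValueError("renewal_probability must be between 0 and 1")
    # Likely renewers get the low-cost channel; at-risk members get a call.
    return "email" if renewal_probability >= threshold else "personal call"


# Hypothetical member scores, for illustration only.
scores = {"member_a": 0.91, "member_b": 0.32}
plan = {member: assign_outreach(p) for member, p in scores.items()}
print(plan)
```

In practice the threshold would be set by how many personal calls the team can afford to make, which is a business decision rather than a modeling one.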
Describe your working Solution
The first part of the predictive modeling process is business understanding. You need to determine what you need to solve and understand why you need to solve it. You also need to understand the data: who has it, where it's located, how good it is, and how confident you are in it. But you're not going to determine that alone.
You need to collaborate with experts, both internal and external. A great external resource is the Alteryx Community; its members are great to bounce ideas off of when solving problems. You'll also want to talk to all the internal stakeholders, be it finance, operations, marketing, or whoever else is involved. Bring them in to have a conversation so that you can understand the breadth of the problem, the scope of what they need, and the information you do or don't have.
When we talk about internal experts, there are three kinds you're going to run into. First, there are data workers. These aren't necessarily people in your immediate group, but anybody in your organization who has data, ranging from individual contributors to executives. The second group is the data advocates. These people grasp the importance of your objective. They don't always get it, but they're enthusiastic. People can move between the data advocate group and the data worker group. The last group is the data resistors. They're protective of the status quo and want to do things their way. They are less prevalent as the world becomes more data focused, but they do exist.
That being said, it's very important to have someone with influence, such as a C-suite executive or a Vice President, in the data worker or data advocate group. They can help push things along and remove some of the barriers that are going to occur.
The next step is to understand the data. We had different data from several sources (e.g. CRM data, demographics, volunteer information, purchase history). We had marketing automation software that tracked clicks, opens, etc. We even bought data from a third party. We had Excel spreadsheets all over the place, and we had to tie our data together. Thankfully, we knew all these sources existed because we had had those business understanding conversations. This is how one step in the process is tied to the next.
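Tying sources together usually comes down to joining them on a shared key. The sketch below is a minimal Python illustration of that idea, assuming every source carries a hypothetical `member_id` field. The field names and sample records are made up, and the original work was done with Alteryx tools rather than code.

```python
from collections import defaultdict


def combine_sources(*sources, key="member_id"):
    """Merge records from several sources into one record per key value."""
    combined = defaultdict(dict)
    for source in sources:
        for record in source:
            # Later sources add fields; on a name clash the later value wins.
            combined[record[key]].update(record)
    return dict(combined)


# Hypothetical sample records, for illustration only.
demographics = [{"member_id": 1, "state": "IL"}, {"member_id": 2, "state": "WI"}]
email_activity = [{"member_id": 1, "opens": 14, "clicks": 3}]

merged = combine_sources(demographics, email_activity)
print(merged[1])
print(merged[2])
```

Member 2 has no email activity, so those fields are simply absent from the merged record. Gaps like that are exactly what the data prep step has to find and handle.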
Up to this point we haven't actually used Alteryx; the first two steps were just getting the data and understanding it. Now we can start to prep the data. You want to simplify your data set wherever possible, and the Summarize tool is great for doing that. R doesn't work well with blanks or nulls; to get around that, use the Field Summary tool to find them, as you can see in the example below.
I took a class on predictive about three Inspires ago, and they said the Field Summary tool is the most important tool for predictive. I agree. If you don't use it, you should start, particularly for predictive or any larger data sets. The Field Summary tool gives you a good snapshot of what data you have, what data is missing, and where your nulls are.
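For readers without Alteryx at hand, the kind of snapshot described here can be approximated in plain Python. This is a rough sketch of the idea, not a reimplementation of the Field Summary tool. The function name `field_summary`, the output keys, and the sample rows are assumptions.

```python
def field_summary(rows):
    """Report, for each field, how many values are present, missing, and distinct."""
    fields = sorted({name for row in rows for name in row})
    summary = {}
    for name in fields:
        values = [row.get(name) for row in rows]
        # Treat None and empty or whitespace-only strings as missing.
        present = [v for v in values if v is not None and str(v).strip() != ""]
        missing = len(values) - len(present)
        summary[name] = {
            "count": len(present),
            "missing": missing,
            "pct_missing": round(100 * missing / len(values), 1),
            "unique": len(set(present)),
        }
    return summary


# Hypothetical sample rows, for illustration only.
rows = [
    {"member_id": 1, "age": 34, "state": "IL"},
    {"member_id": 2, "age": None, "state": ""},
    {"member_id": 3, "age": 51},
]
summary = field_summary(rows)
for name, stats in summary.items():
    print(name, stats)
```

A field with a high `pct_missing` is a candidate for imputation or removal before it reaches a model.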
Now for the modeling part: we're actually going to get into the data and build the first iteration of the predictive model. There are 4 types of models that I ran through: decision tree, forest model, logistic regression, and boosted model. When I did this, the Model Comparison tool was a great help, although I believe it is currently being updated to some extent.
Of the models, the first is the decision tree; it's a bunch of “if-then” statements. Then there is the Random Forest model, which is a bunch of decision trees put together. If you have small data sets, Random Forest is great. It does take some time to run, though. Then there is logistic regression, which estimates the probability of a yes/no outcome (here, renew or not) from a weighted combination of the predictors, and lastly the boosted model, which builds many small decision trees in sequence, each one correcting the errors of the ones before it. Alteryx has great training on these models and other related items. So, whether it's the Community, the videos, or just talking to your sales rep, they can probably get you training as well.
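To make two of these model types concrete, here is a toy Python sketch: a decision tree written out as literal if-then rules, and a logistic regression as a weighted sum passed through the logistic function. The field names, split points, intercept, and weights are all made up for illustration; a real model learns them from the data.

```python
import math


def tree_predicts_renewal(member):
    """A hand-written decision tree: nested if-then rules ending in a yes/no."""
    if member["years_as_member"] >= 5:
        return True
    if member["email_opens"] >= 10:
        return True
    return False


def logistic_renewal_probability(member):
    """Logistic regression: a weighted sum mapped into (0, 1) by the logistic function."""
    # Intercept and weights are illustrative assumptions, not fitted values.
    z = -2.0 + 0.4 * member["years_as_member"] + 0.1 * member["email_opens"]
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical members, for illustration only.
loyal = {"years_as_member": 5, "email_opens": 0}
new = {"years_as_member": 1, "email_opens": 2}

print(tree_predicts_renewal(loyal), logistic_renewal_probability(loyal))
print(tree_predicts_renewal(new), logistic_renewal_probability(new))
```

The tree gives a hard yes or no, while the logistic model gives a probability that can be ranked or thresholded. A forest or boosted model combines many such trees rather than relying on one.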
After building the model comes the evaluation step. Your results should pass a “smell” test or validation test. I took 60% of the data as an estimation sample and 40% as a validation sample. That's pretty standard. You can play with the numbers, but I wouldn't deviate too much from that 60/40. The most accurate model was the forest model, which was up by 95. The decision tree was very similar. And when you're building models and looking at data, don't do it with blinders on; other insights can come out of your model that can be used for other purposes.
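The 60/40 split and the accuracy check can be sketched in plain Python as follows. This is a minimal illustration, not the Alteryx workflow. The function names, the fixed seed, and the stand-in member list are assumptions.

```python
import random


def split_sample(records, estimation_fraction=0.6, seed=42):
    """Shuffle records and split them into estimation and validation samples."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # a fixed seed makes the split repeatable
    cut = round(len(shuffled) * estimation_fraction)
    return shuffled[:cut], shuffled[cut:]


def accuracy(predicted, actual):
    """Share of records where the prediction matched the actual outcome."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty lists of equal length")
    matches = sum(p == a for p, a in zip(predicted, actual))
    return matches / len(actual)


members = list(range(100))  # stand-in for 100 member records
estimation, validation = split_sample(members)
print(len(estimation), len(validation))  # 60 40
```

The model is fitted on the estimation sample only; `accuracy` is then computed on the validation sample, which the model has never seen.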
After the modeling and evaluation process, we deployed. We ended up just taking the whole file and running it against the model. We didn't deploy to the Cloud, though. Our renewal cycle is once a year. Again, it's an iterative process, so you will continue to refine your models as you evaluate them with your data. Ultimately you want to look for valuable insights. In our case we wanted the model to mirror membership trends, which was something that we relatively understood and confirmed. It wasn't necessarily a part of the model, but something that we took and learned from. And that foresight came from the initial business conversation. The business believed the market was that way. We looked. We confirmed. That's basically the model. And then they created campaigns based upon the insight we provided.
In terms of takeaways, remember that model building is a process to help provide insight, and you'll get through it. Alteryx can make data scientists out of anyone. Again, I'm not classically trained as one, but I have been able to create this model. The important part is getting people through the business understanding step and making sure you have the right problem. Make sure you understand the problem. Don't be intimidated by the process or terminology. Feel empowered. Alteryx will help you through the process. Don't be complacent; don't build a model and just push it off. The only way things will change is if you put forth the effort to learn and make them change. The data, the market, the model and your world will change if you put in the effort.
Describe the benefits you have achieved
With Alteryx, I can do more in less time. It allows me to do more important things, like play hockey with my friends, go out with my wife, and play with my dog. With Alteryx, I'm empowered to do things. Hopefully you feel the same way, that you can be empowered to actually make an impact and be predictive. Alteryx makes it easy for anyone to be a citizen data scientist. You shouldn't be intimidated by the process or terminology; Alteryx will allow you to cut through those barriers. The Community is a great resource for bouncing ideas off people when you're talking about problems or building models. There are plenty of great tools to assist you, like the Summarize tool. I've heard the Summarize tool referred to as salt, because it's used in everything, and I totally agree. It's a great tool.
Related Resources
The entire PowerPoint presentation can be found here
Eric, this is awesome! Any chance you could post a sample workflow? Curious to see how you evaluated accuracy.
Awesome .
Tagging Eric @eokunevich here who is the author of this great story and can provide more details!
@shawnbres , so I can't post the workflow, since it was done at a previous employer, but I can talk about evaluating accuracy. It was a little bit of a process, since I had to prove the concept to management as a whole. When I initially trained the model I had a randomized hold-out sample of about 40%. I took my model, ran it against my historical hold-out sample, and got a result. That was my first pass, and this was the process I used to iterate the model until I felt comfortable. Normally that would be enough; however, we had the advantage of there being a renewal season (everyone renewed at the beginning of the year). So we ran the model, predicted what the results would be, and then did the normal process. We could then compare what we expected vs. what happened at the individual member level.
I hope that helps, let me know if you have additional questions.
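For anyone who wants to try the member-level comparison outside Alteryx, here is a rough Python sketch of the idea. It is an illustration only, not the original workflow. The function name `compare_by_member` and the sample predictions and outcomes are assumptions.

```python
from collections import Counter


def compare_by_member(predicted, actual):
    """Tally (predicted, actual) renewal pairs for members present in both dicts."""
    tally = Counter()
    for member_id in predicted.keys() & actual.keys():
        tally[(predicted[member_id], actual[member_id])] += 1
    return tally


# Hypothetical predictions and outcomes, for illustration only.
predicted = {"m1": True, "m2": True, "m3": False, "m4": False, "m5": True}
actual = {"m1": True, "m2": False, "m3": False, "m4": True, "m5": True}

tally = compare_by_member(predicted, actual)
correct = tally[(True, True)] + tally[(False, False)]
print("correct:", correct, "of", sum(tally.values()))
print("predicted renew, did not renew:", tally[(True, False)])
print("predicted lapse, renewed:", tally[(False, True)])
```

Splitting the misses into the two kinds matters here, because a member wrongly predicted to renew gets only an email when a personal call might have saved the renewal.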