If you’ve been working with data in your day-to-day job and are interested in machine learning, but you feel like it’s out of reach, I’m here to tell you that it’s not.
*Automated Machine Learning has entered the chat*
One of the things I love about Alteryx is its slogan, “Analytics for All.” But the democratization of data insights doesn’t stop at analytics. With Alteryx Machine Learning, analysts and line-of-business professionals can learn about data science and build predictive models with their data.
In this article, I will walk through a few challenges of machine learning (ML) and how Alteryx Machine Learning can help.
Understanding your data
It can be hard to know if your dataset is in good shape. Putting your data in a predictive model is like asking it to run a marathon. It’s strenuous. Is your data healthy enough?
In Alteryx Machine Learning, once you import your data, you can immediately check your data health and address any flagged issues. For this article, I am working with a dataset with credit card information to predict customer churn at a bank. Here is what the data health summary looks like:
Of course, I still have to use my judgment and keep an eye on outliers in my process. Still, it is amazing to see all of this information presented at the beginning so that you can know off the bat if your data will be problematic for a machine learning model or not.
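You can also see in the screenshot that one issue was found with my data. When I click on the details, I can see that I have an ID column in my dataset. Why is that bad? In general, an ID is unique to each row, so it carries no predictive signal and can push a model toward memorizing individual rows instead of learning patterns. When I click on Fix Data, I am told what action I should take and why.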
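If you are curious what a check like this might look like under the hood, here is a minimal sketch in plain Python. It is not how Alteryx Machine Learning implements its data health check; the function name `find_id_like_columns`, the uniqueness threshold, and the sample records are all assumptions I made up for illustration.

```python
def find_id_like_columns(rows, threshold=1.0):
    """Return column names whose share of distinct values is at least `threshold`.

    `rows` is a list of dicts that all share the same keys.
    This is a simple heuristic, not Alteryx's actual rule.
    """
    if not rows:
        return []
    flagged = []
    for column in rows[0]:
        values = [row[column] for row in rows]
        unique_ratio = len(set(values)) / len(values)
        if unique_ratio >= threshold:
            flagged.append(column)
    return flagged


# Hypothetical churn records; the column names are illustrative only.
customers = [
    {"customer_id": 1001, "credit_limit": 5000, "churned": 0},
    {"customer_id": 1002, "credit_limit": 5000, "churned": 1},
    {"customer_id": 1003, "credit_limit": 12000, "churned": 0},
    {"customer_id": 1004, "credit_limit": 3000, "churned": 0},
]

print(find_id_like_columns(customers))  # prints ['customer_id']
```

Keep in mind that a heuristic this simple would also flag a continuous numeric column in which every value happens to be distinct, so the result still needs a human look.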
Once you know your data is healthy, it is important to understand the data and the relationships between columns. In the data insights step, you can see a correlation matrix and then select any of the values to see a bivariate plot. These visuals shorten the time it takes to understand variable relationships.
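To make the idea of a correlation matrix concrete, here is a small sketch that computes Pearson correlations between every pair of numeric columns using only the Python standard library. The column names and values are hypothetical, and I am assuming Pearson correlation here; the product may use a different measure.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences.

    Raises ZeroDivisionError if either sequence has zero variance.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Build a nested dict of pairwise correlations from {name: values}."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Hypothetical numeric columns from a churn dataset.
data = {
    "months_on_book": [12, 24, 36, 48, 60],
    "total_transactions": [80, 65, 50, 42, 30],
    "credit_limit": [3000, 5000, 4000, 9000, 7000],
}

matrix = correlation_matrix(data)
for name, row in matrix.items():
    print(name, {other: round(value, 2) for other, value in row.items()})
```

Values near 1 or -1 point to a strong linear relationship (in this made-up data, transactions fall as tenure grows), while values near 0 suggest little linear relationship. That is the same reading you would apply to the matrix in the data insights step.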
Choosing the right model
When you are new to machine learning, the number of model options is overwhelming. Auto ML helps lessen the cognitive load and allows you to compare many models simultaneously on different criteria.
There are two steps when choosing a model. First, you must know your target variable and machine learning method—whether you have a classification or regression problem. Hopefully, you have an idea of what these should be based on what you are looking to predict, but if you are unsure, the education mode in Alteryx Machine Learning can help!
You can learn about each type of machine learning problem by expanding the drop-downs. Now both you and the machine are doing some learning.
Then you have to decide which model to choose. After running the auto modeling process, here is what I see:
So why is it recommended that I pick the model with a score of 0.99 for my classification model? Well, you can see on the right side that the ranking metric is “Area Under Curve (AUC).” In education mode, I can get more context about this metric:
Great! Now I can make an educated decision about which model to select. I can also toggle between different ranking metrics to find the best model for my needs.
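For a bit more intuition about the AUC metric mentioned above: one standard way to read it is as the probability that a randomly chosen positive example (here, a customer who churned) receives a higher score than a randomly chosen negative one. The sketch below computes it by directly comparing every positive/negative pair. The function name `roc_auc` and the labels and scores are invented for illustration and have nothing to do with the model in my screenshot.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Counts how often a positive example outscores a negative one;
    ties count as half a win. Labels must be 0 or 1.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical churn labels (1 = churned) and predicted churn scores.
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.2, 0.4, 0.5, 0.1, 0.8]

print(round(roc_auc(labels, scores), 3))  # prints 0.889
```

An AUC of 1.0 means every churner was scored above every non-churner, and 0.5 is what random guessing would produce, so a value of 0.99 indicates a model that ranks customers almost perfectly. The pairwise loop is fine for a toy example but slow on large datasets, where a sorting-based implementation would be the usual choice.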
Explaining your model
If your model performs well on cross-validation and holdout data, and you are confident it can help your business, it’s time to get others on board! If you find yourself worried about teaching others concepts that you just learned, there is an excellent feature of Alteryx Machine Learning that can help you walk them through the model.
In the Export and Predict stage, you can export a PowerPoint with visuals generated earlier in the process. The slides contain not only the images but also short descriptions to help with interpretation.
Conclusion
This article discussed some challenges in building machine learning models and how Alteryx Machine Learning can walk you through the sometimes steep learning curve. I didn’t cover every feature: automated feature engineering and simulations are also great parts of the application!
I encourage you to try it out for yourself. Even if you have zero experience with ML, you can use data from your company to play around and learn.