
Data Science

Machine learning & data science for beginners and experts alike.
CailinS
Alteryx

Does this sound familiar? You just watched a fantastic demonstration for advanced users on regression modeling. You think (who wouldn’t?!) “These tools look amazing…imagine what I can do!” So you jump into Alteryx and start plugging in your data! “BUT WAIT! What are all these error messages?!” - this is the stuff Alteryx nightmares are made of…

 

  Inital Errors.png

 

...but it doesn’t have to be!

 

Predictive analytics is a complex thing to tackle. There’s theory, business sense, data (why can’t they all agree?)! Errors can occur when any one of those is off, and sometimes simply because you haven’t run the workflow yet! In this blog you will learn the simple steps to troubleshoot errors in the predictive tools. Once you have the basics down, you’ll get to dive into the world of R troubleshooting. I will present some of the error messages I’ve come across and resolved, to give you a starting place and hopefully the courage to take on any errors you encounter. You can do this!!

 

If time is of the essence and you can’t read everything now, there is a summary of the steps at the end of the post as well as the most common causes of R error messages in the Alteryx predictive tools.

 

I’ve attached a workflow at the bottom of this post so feel free to open that and follow along.

 

The first thing you need to know about the predictive tools is that they are all macros. (If that means nothing to you, Help Menu – Sample Workflows – Tutorials – Build a Macro is a great place to start!) Because they are macros:

 

  1. They may behave differently than other tools.
  2. They contain multiple tools that you can’t see.
  3. They can be opened (right click, Open Macro) so you can see the tools and methods that make up the Macro.
  4. Some information is suppressed by default, and it could be important.

 

Let’s Begin…by running the workflow

Speaking of #1, sometimes a tool will show an error after it is configured until you run it once…so start there! As you may know, many of the predictive macros are based on R (The R Project for Statistical Computing), specifically any tool that has an ‘R’ in its icon. In those tools, the underlying R code doesn’t know what data is coming until the first run. For some tools, this unknown metadata keeps the tool in an error state until data has passed through it once. In the same vein, it’s better to hold off on connecting and configuring downstream tools while the current tool is in an error state, because many Alteryx tools rely on known metadata for successful configuration!

 

Now for the fun stuff

In the example above there are 13 error messages and 4 warnings. Ouch. Where do we start? Well, assuming you have had at least one error in Alteryx before now, you know the first step is to find the tool(s) with an error and select it. (Tip & Trick: you can click on the tool name in your Results – Messages window to go straight to the offending tool!) My advice is to pick the first error and start there, since it could be the cause of all the subsequent errors.

 

Create Samples Error.png

 

Error: Create Samples (27): Error: The estimation and validation samples exceed 100%.

 

This one is pretty simple – the estimation and validation samples together cannot exceed 100%, and they are set to 133%...so take heart! Not all the errors are [seemingly] written in a foreign language. In fact, many errors will indicate exactly what the problem is and may even suggest how to fix it.
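To make the rule behind this message concrete, here is a minimal Python sketch. It is not the Create Samples macro's actual code; the function name is hypothetical, and the 80/53 split is an assumed example chosen only because it totals 133%.

```python
def sample_split_ok(estimation_pct, validation_pct):
    """Return (ok, total) for a hypothetical estimation/validation split.

    Assumption: the two percentages are valid as long as their
    total does not exceed 100, mirroring the wording of the error message.
    """
    total = estimation_pct + validation_pct
    return total <= 100, total


print(sample_split_ok(80, 53))  # (False, 133)
print(sample_split_ok(70, 30))  # (True, 100)
```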

 

At this point you can click on the ‘next’ error and continue fixing simple things, or you can rerun the workflow to see if some of the errors go away once the first one is resolved. Since the dataset in this example was literally built to break these tools (no pain, no gain, right?), we aren’t so lucky. In this case, the next error looks something like this: Elvish.png …so what now?

 

It’s time to look under the hood

Because R is running code behind the scenes (Don’t believe me? Right click and open one of the tools with an ‘R’, and press Ctrl + F to search for the R tool and see the code inside!), some of the error messages you will encounter come directly from R and may truly feel like a different language. In R, an error message may only make sense in the context of the surrounding messages…but there aren’t any other messages! Luckily, there ARE more messages; they are just hidden. When a macro is brought into an Alteryx workflow, the default setting is to show only macro errors in the Results – Messages window. This is a great thing most of the time, because there could be hundreds of tools in a macro and thousands of messages, and that information would likely confuse the macro user if it were always shown. But when there is an error, the extra information can be helpful. To get it, turn on the ‘Show All Macro Messages’ setting: click on the workflow white space to see the workflow properties, then find the setting on the Runtime tab.

 

Show Macro Messages.png

 

Now re-run the workflow and find the tool with the first error (Tool name (#) message). Often, the message right before or right after the first error indicates the problem, so focus on that tool’s messages. (Tip & Trick: look at the messages for this tool only by selecting the tool, then clicking the Messages icon in the properties window. In this example there are 16 messages for this tool alone, and 200 for the workflow once macro messages are enabled.)
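If you have copied the messages out of the Results window into a list, the "look right before and right after the first error" habit can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, the sample strings are placeholders rather than real Alteryx output, and it assumes error lines start with the word "Error".

```python
def context_around_first_error(messages, window=1):
    """Return the first error message plus `window` messages on each side."""
    for i, msg in enumerate(messages):
        # Assumption: an error line begins with "Error" (case-insensitive).
        if msg.lower().startswith("error"):
            start = max(0, i - window)
            return messages[start:i + window + 1]
    return []  # no error found


# Placeholder messages, invented purely for the example.
log = [
    "Info: placeholder message before the error",
    "Error: placeholder R error",
    "Warning: placeholder message after the error",
    "Error: a later error that may just be a consequence",
]

for line in context_around_first_error(log):
    print(line)
```

Widening `window` pulls in more of the surrounding messages when the nearest neighbors are not informative.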

 

Forest Model Messsages.png

 

Although these messages are generated by R (vs Alteryx), many of them are straightforward and easy to resolve. Here, the message right below the error references missing values being present, and missing values can cause a lot of problems in R. It is always the first thing to check for (with a Field Summary tool!) and my first suspicion when I see an error in one of the predictive tools.
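For readers who think better in code, here is a small Python sketch of the kind of count a missing-value check produces. It is not how the Field Summary tool works internally; the function name and the toy rows are assumptions made for illustration (only the field name Food_Away and its values come from the error message in this post, and Income is invented).

```python
def count_missing(rows, missing=(None, "")):
    """Count missing values per field in a list of row dictionaries."""
    counts = {}
    for row in rows:
        for field, value in row.items():
            counts.setdefault(field, 0)
            # Assumption: None and empty strings are what count as "missing".
            if value in missing:
                counts[field] += 1
    return counts


# Toy data, invented for the example.
rows = [
    {"Food_Away": 1993.33, "Income": 52000},
    {"Food_Away": 238.33, "Income": None},
    {"Food_Away": None, "Income": 61000},
]

print(count_missing(rows))  # {'Food_Away': 1, 'Income': 1}
```

Any field with a count above zero is a candidate for the cleanup options described next.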

 

In Alteryx, there are myriad ways to deal with missing data and fix the issue (e.g. Filter to remove rows, Formula to replace values, Select to exclude fields). The Data Investigation tools will help you find issues in the data and often indicate how to resolve the issue. For instance, is the variable important enough to fix? What value should I use to replace missing data? Those questions can only be answered by knowing your data and how important each piece is (or isn’t) to your target behavior. Re-run the workflow as usual to confirm issues are resolved. In this example, it is error free and ready for action!

 

All Done Workflow.png

 

Finally…hit the road aka the information superhighway

When the steps above don’t resolve all issues, it is time to hit the road! There will be error messages that aren’t immediately clear, messages that tell of an issue beyond your understanding. When those occur, know that R is a robust tool and language. It also has a very robust community! So my final piece of advice is to head to the internet and use the wealth of resources there! To get the most out of your search, copy/paste only the R error message portion (leaving out the Alteryx-specific pieces). In the messages above (had we not fixed them so easily) ‘Error in na.fail.default(list(Food_Away = c(1993.33, 238.33, 694, 441.67,  :missing values in object’ the bold text represents the R-specific error. I find that pasting the error and adding ‘in R’ usually results in a helpful article or two!
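As a rough sketch of that copy/paste habit, the Python below strips an Alteryx-style prefix and appends ‘in R’ to build a search phrase. The function names are hypothetical, the prefix pattern is an assumption based on the "Tool name (#)" format shown earlier in this post, and the tool number 3 in the sample is made up.

```python
import re

# Assumed prefix shape: "Error: Tool Name (12): " (severity, tool name, tool id).
ALTERYX_PREFIX = re.compile(r"^(?:Error|Warning|Info):\s*[^:]*?\(\d+\):\s*")


def r_error_text(message):
    """Drop the Alteryx-specific prefix, keeping only the R portion."""
    return ALTERYX_PREFIX.sub("", message, count=1)


def search_phrase(message):
    """Build a search phrase from the R portion of a message."""
    return r_error_text(message) + " in R"


# Sample line: the R text is quoted from this post; "Forest Model (3)" is assumed.
sample = ("Error: Forest Model (3): Error in na.fail.default(list(Food_Away = "
          "c(1993.33, 238.33, 694, 441.67,  :missing values in object")

print(search_phrase(sample))
```

If the message does not match the assumed prefix, it is returned unchanged, so the sketch never discards text it does not recognize.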

 

In conclusion…you can do it!

You really can do it! These tools can open up a whole new world if you will take the time to prep, investigate, and troubleshoot when things aren't perfect the first time. Below I have summarized the whole post to make it easier to brush up when you need to apply these concepts! Good luck!

 

Steps:

  1. Run the workflow
  2. Re-Run the workflow
  3. Click on the first error message in the Results – Messages
  4. Read it :) Fix the issue if it’s straightforward
  5. Turn on Macro Messages and re-run the workflow
  6. Find the tool with the first error message and open its messages
  7. *World of R troubleshooting from here on out* Read the messages/warnings right above and below the first error and fix the issue if it’s straightforward
  8. If it’s still unclear, copy/paste the R error into a search engine and use the R community to guide you (Stack Overflow is a great option!)

Common Causes:

  1. Missing data
  2. Bad data (e.g. long string values and/or special characters in the field names OR data)
  3. Bad theory (e.g. when there are more predictor variables than there are rows of data)
  4. Bad connection (e.g. you plugged the Report output into a tool that wants the model Object output)
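
Two of the causes above (more predictors than rows, and special characters in field names) are easy to screen for before modeling. The Python sketch below is one way to think about it, not a feature of the predictive tools; the function name, the sample field names, and the deliberately strict "letters, digits, and underscores only" naming rule are all assumptions.

```python
import re

# Assumed "safe" name: starts with a letter, then letters, digits, or underscores.
SAFE_NAME = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def preflight_warnings(field_names, n_rows, target):
    """Return a list of warnings for a hypothetical modeling dataset."""
    warnings = []
    predictors = [name for name in field_names if name != target]
    if len(predictors) > n_rows:
        warnings.append(
            f"{len(predictors)} predictors but only {n_rows} rows of data"
        )
    for name in field_names:
        if not SAFE_NAME.fullmatch(name):
            warnings.append(f"field name {name!r} contains special characters")
    return warnings


# Invented field names for the example.
fields = ["Food_Away", "Food At Home ($)", "Income"]
for warning in preflight_warnings(fields, n_rows=1, target="Income"):
    print(warning)
```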

Thanks for reading! Try out these tips and let us know how it goes. Please feel free to add tips of your own, or error messages that you’ve encountered and resolved when using the Predictive Tools. For ones not yet resolved, search the Forum, or create a new post!

 
