SOLVED

Predictive - Logistic Regression Error

KarlWang
7 - Meteor

Hi, I am a new user. I am using Predictive - Logistic Regression to make predictions.

 

Error: Logistic Regression (2): Tool #10: Tool #30: The field "Fit_Stats" is not contained in the record.

 

What do 'Tool #10:' and 'Tool #30:' mean?

What is "Fit_Stats"? It is not one of my factors. Where does it come from, and why does the error message mention "Fit_Stats"?

 

Thank you so much.

13 REPLIES
michael_treadwell
ACE Emeritus

Each tool in an Alteryx workflow has a unique ID.

 

To determine a tool's ID, click on the tool and click on the 'Price Tag' icon to the left of the configuration window.

 

Capture2.PNG

 

You should see a screen similar to the one below with the Tool ID in the top right corner

 

Capture.PNG

 

As for Fit_Stats in your logistic regression, it could be one of many issues. Could you upload a module so that we can take a look?

MarshallG
8 - Asteroid

I've gotten this error previously using Logistic Regression, and I believe (but am not certain) that what solved it was excluding any columns that hold only one constant value (e.g., all the values are 1s) from the list of predictor variables.

 

I agree that having your exact workbook would let us help further if there is more to it.
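The constant-column check suggested above can be done before the data reaches the model. The following is only a minimal sketch in plain Python, assuming the data is available as a list of dictionaries; the function name `constant_columns` and the sample rows are hypothetical, and inside Alteryx the same check would be done with the workflow's own tools.

```python
def constant_columns(rows):
    """Return the names of columns whose non-missing values are all identical."""
    if not rows:
        return []
    constant = []
    for name in rows[0]:
        # Collect the distinct non-missing values seen in this column.
        distinct = {row[name] for row in rows if row[name] is not None}
        if len(distinct) <= 1:
            constant.append(name)
    return constant


# Hypothetical sample data: "flag" is 1 on every row.
rows = [
    {"age": 34, "flag": 1, "region": "N"},
    {"age": 51, "flag": 1, "region": "S"},
    {"age": 27, "flag": 1, "region": "N"},
]

print(constant_columns(rows))
```

Any column reported here carries no information for the model and can be left out of the predictor list.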

KarlWang
7 - Meteor

Hi Michael,

 

Thank you for your reply. What does "module" mean in your answer? How can I get my module? Thank you so much for your kindness.

 

Btw, there is another error: Error: Logistic Regression (2): Logistic Regression: Error: cannot allocate vector of size 4.4 Gb.

Is it a memory problem? My computer has 8 GB of RAM; is that not enough? My data has fewer than 30,000 rows and I chose about 5 to 7 factors as my variables, so it is not large.

 

 

KarlWang
7 - Meteor

Hi MarshallG,

 

Thank you for your reply.

None of my variables has a constant value, so that is not the problem.

What is the "exact workbook"? The error information or a data sample?

 

Btw, the other error is "Error: Logistic Regression (2): Logistic Regression: Error: cannot allocate vector of size 4.4 Gb". Have you seen it before?

Thank you.

MarshallG
8 - Asteroid

You should have enough memory for that. As you said, it is not a large number of rows or columns.

 

When Michael says 'module', he means literally upload the Alteryx workflow on which you are having the problem (with dummy data if need be).

CailinS
Alteryx

It is important to remember/know that all of the predictive tools are actually Macros (.yxmc). That changes a few things during troubleshooting. It turns out that a macro is just a workflow wrapped up behind a tool icon/interface. It has its own tools/messages/errors/warnings but by default, only errors float up to the workflow/module where you are using the predictive tool (macro). When you get errors it can be helpful to see ALL the messages (not just errors). I find that the message right before or right after the error often tells me what is wrong.

 

Go to your Workflow Properties: Runtime Tab. Select the check box to 'Show All Macro Messages' and rerun your workflow. Check out the new messages you'll see and let us know if that helps!

 

2015-11-19_13-16-49.jpg

 

Fit Stats is an object in the final report (what comes out of the R output from the Logistic Regression tool). For some reason it isn't successfully created due to a data issue (perhaps a long variable name, missing data, special characters, or a number of other things R does not like) that the macro messages might tell us more about!
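The kinds of data issues listed above (long variable names, missing data, special characters) can be screened for before running the model. This is only a rough sketch in plain Python, not anything the Logistic Regression macro does itself; the function name `audit_columns`, the 32-character limit, and the "safe name" pattern are all assumptions chosen for illustration.

```python
import re

MAX_NAME_LENGTH = 32  # assumed threshold, for illustration only
SAFE_NAME = re.compile(r"^[A-Za-z][A-Za-z0-9_]*$")  # assumed "safe" naming pattern


def audit_columns(rows):
    """Map each suspicious column name to a list of the problems found."""
    issues = {}
    if not rows:
        return issues
    for name in rows[0]:
        problems = []
        if len(name) > MAX_NAME_LENGTH:
            problems.append("long name")
        if not SAFE_NAME.match(name):
            problems.append("special characters in name")
        # Treat None and empty strings as missing values.
        missing = sum(1 for row in rows if row[name] in (None, ""))
        if missing:
            problems.append(f"{missing} missing value(s)")
        if problems:
            issues[name] = problems
    return issues


# Hypothetical sample data.
rows = [
    {"Income ($)": 100, "age": None},
    {"Income ($)": 200, "age": 30},
]

print(audit_columns(rows))
```

Columns flagged this way are candidates for renaming or cleaning before the model is rerun with all macro messages shown.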

Cailin Swingle
Customer Experience
KarlWang
7 - Meteor

Thank you for the solution. I solved the problems yesterday, but your answer is also helpful, because I did not know about the 'Show All Macro Messages' option before.

 

I still have a small question about how the model runs.

In Predictive - Logistic Regression, I have two kinds of variables.

Some are continuous variables, such as X1 = [1, 2, 4, 5, 7, 8, 13, ..., 100] and X2 = [0.11, 0.23, 1.24, 2.33, ..., 9.87, 9.96].

The others are categorical levels, such as X3 = [10%-20%, 10%-20%, 10%-20%, 20%-30%, ..., 80%-90%, 80%-90%, 90%-100%]; every row belongs to exactly one category.

For example, with 10,000 rows of data there are 10,000 values each of X1, X2, and X3. If I set X1, X2, and X3 as the predictor variables, the model runs extremely slowly and in the end I get nothing, just the error "Error: Logistic Regression (2): Logistic Regression: Error: cannot allocate vector of size 3.2 Gb". But if I change X1 and X2 into several categories, like X1 = [1-10, 1-10, 11-20, ..., 91-100] and X2 = [0-1, 0-1, 1-2, 1-2, ..., 9-10], then the model works and the error goes away.

 

So here are my questions:

1. What is the difference between using the continuous variables directly and using the categorised variables?

2. Why is it extremely slow to use the continuous variables, and why does it cause the memory problem?

 

I hope I described it clearly enough...

Thank you.

MarshallG
8 - Asteroid

Is it possible that Alteryx thinks your continuous variable is actually a string and not a float or double? If the tool is trying to evaluate your numbers as strings, it would (I think) try to create a factor level for every single value, and it would likely error out.
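To see why a numeric column read as a string could blow up memory, here is a back-of-the-envelope sketch in plain Python. It assumes the usual dummy coding (an intercept plus one indicator column per factor level beyond the first) and a dense matrix of 8-byte doubles; the function names are hypothetical, and the real fitting code makes additional working copies, so this is not an attempt to reproduce the exact figure in the error message.

```python
def design_matrix_columns(values, treat_as_numeric):
    """Approximate number of model-matrix columns for one predictor plus an intercept."""
    if treat_as_numeric:
        return 2  # intercept + a single slope column
    levels = len(set(values))
    # Intercept + one indicator column per level beyond the reference level.
    return 1 + max(levels - 1, 0)


def dense_matrix_bytes(n_rows, n_cols, bytes_per_value=8):
    """Size of a dense matrix of doubles, in bytes."""
    return n_rows * n_cols * bytes_per_value


# 10,000 rows where every value is distinct, stored as strings ("0.00" .. "99.99").
values = [f"{i / 100:.2f}" for i in range(10_000)]

numeric_cols = design_matrix_columns(values, treat_as_numeric=True)
factor_cols = design_matrix_columns(values, treat_as_numeric=False)

print(numeric_cols, dense_matrix_bytes(len(values), numeric_cols))
print(factor_cols, dense_matrix_bytes(len(values), factor_cols))
```

Under these assumptions, the same 10,000 values take about 160 KB as a numeric predictor but about 800 MB as a factor with 10,000 levels, before any working copies. Binning the values into ten ranges cuts the number of levels to ten, which is consistent with the binned model running without the error.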

CailinS
Alteryx

@MarshallG is on to something. Check with a Select tool right before your model to make sure that the continuous data is specified as a numeric type!

Cailin Swingle
Customer Experience