K Fold Cross Validation

Inactive User
Not applicable

1. When I use the Sample tool to create training and testing data sets, is the split fixed, with no K-Fold Cross-Validation involved? In other words, is it just simple validation?

2. The Logistic Regression tool has no option for setting up K-Fold. Do none of the predictive tools have this function?

3. The Sample tool has an option called "random seed". What does it mean, and in what situations do I need to change the default value? Thank you.
BridgetT
Alteryx Alumni (Retired)

Hi @Inactive User,

 

I'll respond to your questions in order:

 

1. The Sample tool is deterministic under all of its configuration options except "Random 1 in N chance for each Record." Also, it only outputs a single stream of data, not 2 (which you'd need for any sort of validation). The Create Samples tool, however, is explicitly intended for separating data into an estimation dataset and a holdout dataset. Thus, the Create Samples tool can be used for simple validation. Neither tool is intended for K-Fold Cross-Validation, though you could use multiple Create Samples tools to perform it.

2. You're correct that the Logistic Regression tool does not support built-in Cross-Validation. At this time, a few Predictive tools (such as the Boosted Model and the Decision Tree) do Cross-Validation internally to choose certain hyperparameters. However, this Cross-Validation is different than the Cross-Validation used in model comparison/selection. You can expect to see a tool for model selection Cross-Validation on the Gallery in the relatively near future.

3. The Sample tool does not have a Random Seed, but the Create Samples tool does. You should change the default value if you'd like to run your workflow again with different selections for Estimation, Validation, and Holdout data.
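
To make points 1 and 3 concrete outside of Designer, here is a minimal Python sketch of K-Fold splitting driven by a random seed. It is not how the Create Samples tool is implemented; the function names `k_fold_indices` and `train_test_splits` are made up for illustration. It only shows the general idea: the seed fixes which records land in which fold, and each fold is held out exactly once.

```python
import random


def k_fold_indices(n_records, k, seed):
    """Shuffle record indices with a fixed seed and deal them into k folds."""
    rng = random.Random(seed)  # private generator: the seed fully determines the shuffle
    indices = list(range(n_records))
    rng.shuffle(indices)
    # Fold i takes every k-th shuffled index, starting at position i.
    return [indices[i::k] for i in range(k)]


def train_test_splits(folds):
    """Yield (train, test) index lists, holding out each fold exactly once."""
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Hypothetical example: 10 records, 5 folds, seed 1.
folds = k_fold_indices(n_records=10, k=5, seed=1)
for train, test in train_test_splits(folds):
    print("train:", sorted(train), "test:", sorted(test))
```

Running this twice with the same seed gives the same folds, which is why a workflow with an unchanged seed keeps selecting the same records. Passing a different seed reshuffles the records and produces a different split.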

 

Best,

Bridget

Bridget Toomey

Research Scientist, Analytic Products

Alteryx
Inactive User
Not applicable

@BridgetT Thank you so much. This information is helpful.

Also, where my questions above say "Sample tool", I actually meant the "Create Samples tool". You get my point. Thanks.

BridgetT
Alteryx Alumni (Retired)

@Inactive User: You're welcome! Glad I could help!

gmerce
7 - Meteor

Hi, 

 

Does the Count Regression tool include built-in cross-validation? I don't think so.

Do you have an example of how to train a count regression model using cross-validation instead of training it on a sample dataset?

 

Thanks a lot.

NeilR
Alteryx Alumni (Retired)

@gmerce You can use the Cross Validation tool, available from the Predictive District, after a Count Regression tool. While the Cross Validation tool doesn't alter the model generated by the Count Regression tool, it is designed to generate more accurate performance measures without the need to train your model on a sample of the data.
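
As a rough illustration of that idea outside of Designer, here is a small Python sketch. It is not the Cross Validation tool's implementation: the data are synthetic, the single-predictor Poisson fit and every function name (`poisson_sample`, `fit_poisson`, `cross_validated_mae`) are assumptions made for this example. The point it shows is that the final model is still trained on all records, while the K folds are used only to estimate how well such a model predicts records it has not seen.

```python
import math
import random


def poisson_sample(rng, lam):
    """Draw one Poisson(lam) count using Knuth's multiplication method."""
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def fit_poisson(xs, ys, iterations=25):
    """Fit log(mean) = a + b*x by Newton's method and return (a, b)."""
    a, b = math.log(sum(ys) / len(ys)), 0.0  # start from the intercept-only fit
    for _ in range(iterations):
        mus = [math.exp(a + b * x) for x in xs]
        # Gradient of the Poisson log-likelihood.
        g_a = sum(y - mu for y, mu in zip(ys, mus))
        g_b = sum((y - mu) * x for x, y, mu in zip(xs, ys, mus))
        # Entries of the negated Hessian (a symmetric 2x2 matrix).
        h_aa = sum(mus)
        h_ab = sum(mu * x for x, mu in zip(xs, mus))
        h_bb = sum(mu * x * x for x, mu in zip(xs, mus))
        det = h_aa * h_bb - h_ab * h_ab
        # Newton step: solve the 2x2 system H * step = gradient.
        a, b = (a + (h_bb * g_a - h_ab * g_b) / det,
                b + (h_aa * g_b - h_ab * g_a) / det)
    return a, b


def cross_validated_mae(xs, ys, k=5, seed=1):
    """Return the held-out mean absolute error for each of k folds."""
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    scores = []
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        a, b = fit_poisson([xs[t] for t in train], [ys[t] for t in train])
        errors = [abs(ys[t] - math.exp(a + b * xs[t])) for t in test]
        scores.append(sum(errors) / len(errors))
    return scores


# Synthetic data, made up for this sketch: counts whose log-mean is 0.5 + 0.3*x.
rng = random.Random(0)
xs = [rng.uniform(0.0, 3.0) for _ in range(200)]
ys = [poisson_sample(rng, math.exp(0.5 + 0.3 * x)) for x in xs]

final_model = fit_poisson(xs, ys)          # trained on every record
fold_scores = cross_validated_mae(xs, ys)  # performance estimate only
print("coefficients:", final_model)
print("mean held-out MAE:", sum(fold_scores) / len(fold_scores))
```

The per-fold models are thrown away after scoring. Only `final_model`, fit on the full dataset, would be used for predictions, and the averaged fold score is reported alongside it.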
