Assisted modelling tool in designer
Hi all
Wanted to ask: does anyone know whether, for the Assisted Modelling tool, we need to split the data into training and validation data sets? I saw some documentation from a couple of years back that said yes, but at Inspire in 2022 they said it is not required because it is all done within the tool as part of the scoring of the different models. I can't see any documentation to support this, so it would be great to validate it either way.
Many Thanks
Hi @aatalai,
Great question. For the Alteryx Intelligence Suite on Designer Desktop, use the Create Samples tool to split the data into the sample sizes you need.
In the Alteryx Machine Learning platform for the cloud, this step is included automatically; that is perhaps what you recall from Inspire.
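For anyone curious what the Create Samples tool is doing conceptually, here is a minimal plain-Python sketch of a shuffled train/validation split. The 80/20 ratio, the fixed seed, and the `records` list are illustrative assumptions, not Alteryx defaults:

```python
import random

def split_records(records, train_frac=0.8, seed=42):
    """Shuffle and split a list of rows into train/validation sets,
    roughly what a percentage-based sampling tool does."""
    rng = random.Random(seed)      # fixed seed => reproducible split
    shuffled = records[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_frac)
    return shuffled[:cut], shuffled[cut:]

records = list(range(100))         # stand-in for your data rows
train, valid = split_records(records)
print(len(train), len(valid))      # 80 20
```

Holding the seed fixed keeps the split reproducible between workflow runs, which matters when you want the validation scores of different models to be comparable.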
I hope this helps,
Regards
@c-lopez thanks for your response. Is there a tool among the Python tools to compare the models based on the unseen data (validation data set)? For the R tools there is the Model Comparison tool.
Ta
Great question again. The path of least resistance would be to create the samples as I mentioned before and then use your validation set as the input to the Score tool; that way you can compare the models yourself after analysis. Not ideal, but this is the closest to what you want to accomplish.
For the R-based tools you can use the Model Comparison tool.
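The "compare yourself" step can be done with a few lines in a Python tool. A minimal sketch, where the actuals and both models' scored outputs are hypothetical placeholder lists standing in for your validation rows:

```python
def accuracy(actual, predicted):
    """Fraction of validation rows a model predicted correctly."""
    hits = sum(a == p for a, p in zip(actual, predicted))
    return hits / len(actual)

actual  = [1, 0, 1, 1, 0, 1, 0, 0]   # held-out validation labels
model_a = [1, 0, 1, 0, 0, 1, 0, 1]   # hypothetical scored output, model A
model_b = [1, 0, 1, 1, 0, 0, 0, 0]   # hypothetical scored output, model B

print(accuracy(actual, model_a))     # 0.75
print(accuracy(actual, model_b))     # 0.875
```

In a real workflow you would join each model's scored output back to the validation actuals before computing the comparison.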
@c-lopez Thanks. So there is no prebuilt Python one to calculate F1/AUC/RMSE etc. for scoring the unseen data?
I'm comfortable with the R tools and splitting the data.
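If no prebuilt tool turns up, these metrics are small enough to compute by hand inside a Python tool. A self-contained sketch in plain Python (the example label and score lists are made up for illustration; in practice you would read them from the scored validation data):

```python
import math

def f1_score(actual, predicted, positive=1):
    """F1 = 2PR / (P + R), from true/false positives and false negatives."""
    tp = sum(a == positive and p == positive for a, p in zip(actual, predicted))
    fp = sum(a != positive and p == positive for a, p in zip(actual, predicted))
    fn = sum(a == positive and p != positive for a, p in zip(actual, predicted))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

def rmse(actual, predicted):
    """Root mean squared error, for regression scores."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def auc(actual, scores):
    """AUC via the rank-sum (Mann-Whitney) formulation: the probability
    that a random positive is scored higher than a random negative."""
    pos = [s for a, s in zip(actual, scores) if a == 1]
    neg = [s for a, s in zip(actual, scores) if a == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

actual = [1, 0, 1, 1, 0]
pred   = [1, 0, 0, 1, 0]              # hard class predictions
scores = [0.9, 0.4, 0.3, 0.8, 0.2]    # predicted probabilities
print(f1_score(actual, pred))         # 0.8
print(auc(actual, scores))            # ~0.833
```

If the Python tool's environment has scikit-learn available, `sklearn.metrics` provides equivalent (and more battle-tested) implementations of all three.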
Hello,
Yes, it was necessary to split the data manually in the past, but most assisted modelling workflows now take care of this automatically during model validation. It's a good idea to consult the individual tool's documentation, though, as some may still support or permit bespoke splits. If you're not sure, you could confirm with a quick test or by contacting the support team.
Regards
David Warner
Hi!
From what you’ve mentioned, it sounds like there have been updates to the Assisted Modelling tool since the older documentation was written. Modern tools typically incorporate automatic data splitting for training and validation during the model training process, to streamline the workflow and avoid manual intervention.
However, since what you heard at Inspire and the older documentation don’t align, I’d recommend checking the latest version of the tool’s official documentation, or reaching out directly to the support team for the most accurate and up-to-date information. It’s always best to verify so you know you’re using the tool effectively!
Hope this helps, and that you find the info you need!
