Alteryx Designer Desktop Ideas

Correctly Name The Oversample Field Tool

Hello!
A quite minor, pedantic issue from me today. 

 

Currently, the Oversample Field Tool's naming and configuration suggest that the tool oversamples data:

TheOC_0-1661438399726.png

However, I would argue the tool undersamples data instead.

Here are a few sources that explain this much better than I can:

And an image taken from Medium:

TheOC_0-1661435857789.png

Effectively, the goal of either approach is to end up with a similar (or equal) number of records in each class. Undersampling takes a random sample from the majority class, so you end up with a smaller dataset than you started with. Oversampling duplicates records within the minority class, so you end up with a larger dataset.
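To make the difference concrete, here is a minimal Python sketch of both approaches. It is only an illustration of the two definitions above, not how the Alteryx tool is implemented; the function names (`group_by_class`, `undersample`, `oversample`) and the `label_key` parameter are made up for this example.

```python
import random
from collections import defaultdict


def group_by_class(records, label_key):
    """Group records (dicts) into lists keyed by their class label."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec[label_key]].append(rec)
    return groups


def undersample(records, label_key, seed=0):
    """Randomly keep only as many records per class as the smallest class has."""
    rng = random.Random(seed)
    groups = group_by_class(records, label_key)
    target = min(len(g) for g in groups.values())
    out = []
    for g in groups.values():
        # Sampling without replacement: larger classes lose records.
        out.extend(rng.sample(g, target))
    return out


def oversample(records, label_key, seed=0):
    """Duplicate records at random until every class matches the largest class."""
    rng = random.Random(seed)
    groups = group_by_class(records, label_key)
    target = max(len(g) for g in groups.values())
    out = []
    for g in groups.values():
        out.extend(g)
        # Sampling with replacement: smaller classes gain duplicate records.
        out.extend(rng.choices(g, k=target - len(g)))
    return out
```

With a toy dataset of 90 "No" records and 10 "Yes" records, `undersample` would return 20 records (10 per class) and `oversample` would return 180 (90 per class).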

 

When using the Oversample tool within Alteryx, using the example workflow for reference:

TheOC_1-1661437548337.png

When summarizing the input:

TheOC_2-1661437603543.png



And the output:

TheOC_3-1661437612224.png

It's clear that the data has actually been undersampled: random samples have been taken from the majority class to match the minority, rather than duplicate minority records being created.
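The same check can be done on any pair of summaries by comparing per-class record counts before and after the tool runs. Below is a rough Python sketch of that comparison; the function name `describe_resampling` is hypothetical, and the counts in the usage note are invented for illustration rather than read from the screenshots above.

```python
def describe_resampling(before, after):
    """Compare per-class counts (dicts of label -> count) before and after resampling."""
    shrank = any(after.get(label, 0) < n for label, n in before.items())
    grew = any(after.get(label, 0) > n for label, n in before.items())
    if shrank and grew:
        return "mixed"
    if shrank:
        # At least one class lost records and none gained any.
        return "undersampled"
    if grew:
        # At least one class gained records and none lost any.
        return "oversampled"
    return "unchanged"
```

For example, with made-up counts, `describe_resampling({"No": 900, "Yes": 100}, {"No": 100, "Yes": 100})` reports `"undersampled"`, because the majority class shrank while the minority class stayed the same size.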

I would suggest renaming the tool to "Undersample Field Tool" and updating the documentation to match, so that new users of the platform are not confused.

 

Kind Regards,

TheOC

5 Comments
IraWatt
17 - Castor

Great spot @TheOC ! 😄

AlteryxCommunityTeam
Alteryx Community Team
Status changed to: Accepting Votes
 
cgoodman3
14 - Magnetar

How about also adding functionality that lets the user choose whether they want to oversample or undersample their data?

 

TheOC
15 - Aurora

Love the idea - I propose a dilemma... what do you then call a tool that can both oversample and undersample data? 😂

JamieHankins
7 - Meteor

I would also like a tool that provides the option of undersampling or oversampling.