
Correctly Name The Oversample Field Tool

A quite minor, pedantic issue from me today. 


Currently, the Oversample Field Tool's naming and configuration suggest that the tool oversamples data:


However, I would argue the tool undersamples data instead.

Here are a few sources that explain this much better than I can:

And an image taken from Medium:


Effectively, the goal of either approach is to end up with a similar (or equal) number of records in each class. Undersampling takes a random sample from the majority class, so you end up with a smaller dataset than you started with. Oversampling duplicates records within the minority class, so you end up with a larger dataset.
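
To make the distinction concrete, here is a minimal Python sketch of the two approaches. This is purely illustrative and is not how the Alteryx tool is implemented; the function names (`undersample`, `oversample`, `group_by_class`) and the 10/90 toy dataset are my own assumptions for the example.

```python
import random
from collections import Counter, defaultdict


def group_by_class(records, label_of):
    """Bucket records by their class label."""
    groups = defaultdict(list)
    for record in records:
        groups[label_of(record)].append(record)
    return groups


def undersample(records, label_of, rng):
    """Randomly sample every class down to the size of the smallest class."""
    groups = group_by_class(records, label_of)
    target = min(len(group) for group in groups.values())
    result = []
    for group in groups.values():
        # Sampling without replacement: no record appears twice.
        result.extend(rng.sample(group, target))
    return result


def oversample(records, label_of, rng):
    """Duplicate records in smaller classes up to the size of the largest class."""
    groups = group_by_class(records, label_of)
    target = max(len(group) for group in groups.values())
    result = []
    for group in groups.values():
        result.extend(group)
        # Sampling with replacement: existing records are duplicated.
        result.extend(rng.choices(group, k=target - len(group)))
    return result


# Hypothetical toy data: 10 "yes" records and 90 "no" records.
rng = random.Random(0)
data = [("yes", i) for i in range(10)] + [("no", i) for i in range(90)]
label = lambda record: record[0]

under = undersample(data, label, rng)
over = oversample(data, label, rng)

print(len(under), Counter(map(label, under)))  # 20 records, 10 per class
print(len(over), Counter(map(label, over)))    # 180 records, 90 per class
```

The undersampled output is smaller than the input (20 records versus 100), while the oversampled output is larger (180 records). Comparing the record count before and after is a quick way to tell which of the two a tool is actually doing.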


When using the Oversample tool within Alteryx, using the example workflow for reference:


When summarizing the input:


And the output:


It's clear that the data has actually been undersampled: random samples have been taken from the majority class to match the minority class, rather than duplicate minority records being created.

I would suggest a quick renaming of the tool to "Undersample Field Tool", along with a matching update to the documentation, to avoid confusing new users of the platform.


Kind Regards,