Alteryx Designer Desktop Discussions

pierrelouisbescond · ‎02-14-2019

Dear Community,

Probably a simple question for advanced users but not for a young Padawan like me ^^

I have a file with 5000 lines, of which roughly 85% have a “NOK” output and roughly 15% are “OK” (quite unbalanced). To train my model properly, I would like to feed it a balanced sample (50% NOK and 50% OK).

I start by isolating the “OK” lines with a filter, and then I take a random sample of the “NOK” data.

The problem is that I have to manually set the size of the “NOK” sample so that it matches the number of “OK” lines.

So I would like to use the number of “OK” lines as an input, so that the same number of “NOK” lines is sampled automatically.

I have found some replies that use inputs like this, but they are aimed more at batches than at a simple workflow.

Thanks,

Pierre-Louis

paul_houghton · ‎02-14-2019

Hi Pierre-Louis,

You probably want to have a look at the Oversample Field tool. It sounds like exactly the result you are after.

https://help.alteryx.com/2018.4/Oversample_Field.htm
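For anyone who wants to see the balancing logic itself, here is a minimal Python sketch of the approach described in the question: keep every “OK” row, count them, and randomly draw the same number of “NOK” rows. It is only an illustration and not a description of how the Alteryx tool works internally; the function name `balance_by_undersampling`, the `output` column name, and the 750/4250 split are assumptions made up for the example.

```python
import random


def balance_by_undersampling(rows, label_key="output",
                             minority="OK", majority="NOK", seed=0):
    """Return a 50/50 sample: all minority rows plus an equal-sized
    random draw (without replacement) from the majority rows.

    All names here are hypothetical and chosen for illustration only.
    """
    rng = random.Random(seed)  # fixed seed so the draw is repeatable

    # Step 1: split the data by label (the "filter" step in the workflow).
    ok_rows = [r for r in rows if r[label_key] == minority]
    nok_rows = [r for r in rows if r[label_key] == majority]

    # Step 2: the sample size N is simply the number of minority rows,
    # capped in case the majority class turns out to be smaller.
    n = min(len(ok_rows), len(nok_rows))

    # Step 3: draw N rows from each class and mix them together.
    balanced = rng.sample(ok_rows, n) + rng.sample(nok_rows, n)
    rng.shuffle(balanced)
    return balanced


# Made-up data: 5000 rows, 15% "OK" (750) and 85% "NOK" (4250).
rows = [{"id": i, "output": "OK" if i < 750 else "NOK"} for i in range(5000)]
balanced = balance_by_undersampling(rows)
ok_count = sum(1 for r in balanced if r["output"] == "OK")
nok_count = len(balanced) - ok_count
```

The point of the sketch is that N never has to be typed in by hand: it is derived from the count of “OK” rows each time the data changes.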

pierrelouisbescond · ‎02-14-2019

Thanks a lot @paul_houghton! The option was not ticked on my Alteryx update... so I was not aware of this option :-)

paul_houghton · ‎02-14-2019

No problem. There are a lot of tools in Alteryx, so knowing which one works best in a given situation can be a challenge. Glad that helped.


Sampling Tool - How to input the desired number of records N?