Probably a simple question for advanced users but not for a young Padawan like me ^^
I have a file with 5000 lines, of which roughly 85% have a “NOK” output and roughly 15% an “OK” output (quite unbalanced). To train my model properly, I would like to feed it a balanced sample (50% NOK & 50% OK).
I start by isolating the “OK” lines with a filter, then apply random sampling to the “NOK” data.
The thing is that I have to manually set the size of the “NOK” sample so that it matches the number of “OK” lines.
So I would like to use the number of “OK” lines as an input, in order to request the same number of “NOK” lines.
I have found replies that touch on this, but they deal with batches rather than a simple workflow.
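To make the goal concrete, here is a minimal Python sketch of the logic described above. It sits outside any workflow tool and is not a ready-made solution. The column name `output`, the helper `balanced_sample` and the generated rows are all assumptions for illustration. The 750/4250 split is simply 15% and 85% of 5000.

```python
import random


def balanced_sample(rows, label_key="output", seed=0):
    """Return a 50/50 mix of "OK" and "NOK" rows.

    The sample size is derived from the data (size of the smaller class),
    so nothing has to be typed in by hand.
    """
    ok_rows = [r for r in rows if r[label_key] == "OK"]
    nok_rows = [r for r in rows if r[label_key] == "NOK"]

    # Use the count of the smaller class as the sample size for both.
    n = min(len(ok_rows), len(nok_rows))

    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    balanced = rng.sample(ok_rows, n) + rng.sample(nok_rows, n)
    rng.shuffle(balanced)  # avoid having all "OK" rows first
    return balanced


# Hypothetical data: 5000 lines, 15% "OK" and 85% "NOK".
rows = [{"id": i, "output": "OK"} for i in range(750)]
rows += [{"id": i, "output": "NOK"} for i in range(750, 5000)]

balanced = balanced_sample(rows)
ok_count = sum(1 for r in balanced if r["output"] == "OK")
nok_count = len(balanced) - ok_count
print(len(balanced), ok_count, nok_count)
```

The key point is that `n` is computed from the “OK” count instead of being entered manually. In a workflow, the equivalent would be a step that counts the “OK” lines and passes that count to the sampling step.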