Alteryx Designer Desktop Discussions

SOLVED

Score tool - target field has an oversample value

apimentel
5 - Atom

Hi,

 

It is not clear to me whether the Score tool option "The target field has an oversampled value" refers to the training (evaluation) data or the testing (validation) data.

Could you please advise?

 

Many thanks

michael_treadwell
ACE Emeritus

Oversampling adjusts the ratio of categories represented in your data and can be done with the Oversample Field tool. The classic example is the male/female ratio. If you have collected a population sample to train a model and your sample contains 65% males, you may want to oversample the females so that your sample more closely represents the actual 50/50 split in the wider population.

 

When this is done, the Score tool needs to know it is dealing with an oversampled value so that it can help correct for the selection bias.
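
To make the rebalancing step concrete, here is a minimal Python sketch that balances a binary target by randomly dropping rows from the larger class until both classes are the same size. It only illustrates the idea and is not the Oversample Field tool's actual implementation; the function name `balance_by_downsampling` and the `target` key are made up for this example.

```python
import random


def balance_by_downsampling(rows, target_key, seed=0):
    """Return a copy of rows in which every target level has the same count.

    Hypothetical helper: each level is randomly reduced to the size of
    the rarest level, so the rare level ends up over-represented
    relative to the original data.
    """
    rng = random.Random(seed)

    # Group the rows by the value of the target field.
    groups = {}
    for row in rows:
        groups.setdefault(row[target_key], []).append(row)

    # Every level is cut down to the size of the smallest one.
    smallest = min(len(group) for group in groups.values())

    balanced = []
    for group in groups.values():
        balanced.extend(rng.sample(group, smallest))
    rng.shuffle(balanced)
    return balanced


# Toy data: 20% of rows have target 1, 80% have target 0.
rows = [{"target": 1} for _ in range(20)] + [{"target": 0} for _ in range(80)]
balanced = balance_by_downsampling(rows, "target")
share_of_ones = sum(row["target"] for row in balanced) / len(balanced)
```

After this step the 1's make up half of the training rows instead of 20%, which is why a model trained on `balanced` will tend to overstate the probability of a 1 unless the scores are corrected afterwards.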

Inactive User
Not applicable

So when the target value is not around 50%, do I need to put a value in this field?

 

For example, suppose my target is binary (1/0), where 1 is 20% of the records and 0 is 80%. What should I put in "The value of the target field that was oversampled", and what should I put in the percentage?

 

Thank you.

RodL
Alteryx Alumni (Retired)

The Help article on the Score tool states: "If this option is checked the user will be asked to provide the level of the target field that was oversampled and the original percentage of the sample that level represented. This information will be used to adjust the fitted probabilities to match the true sample percentages."

 

So IF you adjusted your data so that the 1's and 0's were at a 50/50 ratio (i.e., you "undersampled" or reduced your 0's to get to that ratio), then you would want to put a "1" in the value and 20 in the percentage. (But if you didn't make any adjustments to your original data, you would not check the option at all.)
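
For anyone curious what "adjust the fitted probabilities" can look like in practice, below is a small Python sketch of the standard prior-correction formula for a binary target. I am assuming this textbook formula for illustration; I have not confirmed that it is exactly what the Score tool does internally. The function name `adjust_probability` and its parameters are hypothetical.

```python
def adjust_probability(p, original_rate, sample_rate):
    """Rescale a probability fitted on rebalanced data to the original mix.

    p             -- model probability of the oversampled level (e.g. 1)
    original_rate -- share of that level in the original data (e.g. 0.20)
    sample_rate   -- share of that level in the training data (e.g. 0.50)

    Assumption: standard prior correction for a binary target, not
    necessarily the Score tool's own calculation.
    """
    if not (0.0 < original_rate < 1.0 and 0.0 < sample_rate < 1.0):
        raise ValueError("rates must be strictly between 0 and 1")

    # Weight each class by how much it was over- or under-represented.
    pos = p * original_rate / sample_rate
    neg = (1.0 - p) * (1.0 - original_rate) / (1.0 - sample_rate)
    return pos / (pos + neg)


# Model trained on a 50/50 sample, but 1's were 20% of the original data.
adjusted_mid = adjust_probability(0.5, original_rate=0.20, sample_rate=0.50)
adjusted_high = adjust_probability(0.9, original_rate=0.20, sample_rate=0.50)
```

With these numbers, a fitted probability of 0.5 comes back as roughly 0.2, which matches the original share of 1's, and a fitted 0.9 drops to roughly 0.69. If the training data was never rebalanced, `sample_rate` equals `original_rate` and the probability is returned unchanged, which corresponds to leaving the option unchecked.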
