Hi All,
I'm trying to replace the NULL values in one of my variables with the median of that variable, grouped by another variable.
For example, given the following dataset, I want to replace each missing Age value with the median of all ages that share the same Title.
Title | Age |
Master | 10 |
Master | 14 |
Master | 3 |
Master | NULL |
Mrs | 26 |
Mrs | 52 |
Mrs | 45 |
Mrs | NULL |
Mrs | 76 |
I can accomplish this with a series of filters connected to imputes and then reconnected via a union, but I was hoping there was a more elegant solution.
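To make the intended result concrete, here is a minimal sketch of the logic in plain Python. This is only an illustration of the grouped-median idea, not the workflow itself; the function name impute_group_median is made up for this example, and None stands in for NULL.

```python
from collections import defaultdict
from statistics import median

# Sample rows from the post; None stands in for NULL.
rows = [
    {"Title": "Master", "Age": 10},
    {"Title": "Master", "Age": 14},
    {"Title": "Master", "Age": 3},
    {"Title": "Master", "Age": None},
    {"Title": "Mrs", "Age": 26},
    {"Title": "Mrs", "Age": 52},
    {"Title": "Mrs", "Age": 45},
    {"Title": "Mrs", "Age": None},
    {"Title": "Mrs", "Age": 76},
]


def impute_group_median(rows, group_key, value_key):
    """Return copies of rows with None in value_key replaced by the group median.

    Hypothetical helper for illustration only.
    """
    # Collect the known (non-null) values for each group.
    known = defaultdict(list)
    for row in rows:
        if row[value_key] is not None:
            known[row[group_key]].append(row[value_key])

    # One median per group, computed from the known values only.
    medians = {group: median(values) for group, values in known.items()}

    # Fill the gaps; a group with no known values at all stays None.
    filled = []
    for row in rows:
        new_row = dict(row)
        if new_row[value_key] is None:
            new_row[value_key] = medians.get(row[group_key])
        filled.append(new_row)
    return filled


filled = impute_group_median(rows, "Title", "Age")
for row in filled:
    print(row["Title"], row["Age"])
```

With the sample data, the missing Master age becomes 10 (median of 3, 10, 14) and the missing Mrs age becomes 48.5 (the midpoint of 45 and 52, since that group has an even number of known values).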
Great, thanks
Is there a way to do this with multiple columns simultaneously? I have around a hundred columns in which I'd like to replace null values with the grouped median, and I'd rather not create 100 formulas.
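For what it's worth, the multi-column case is the same logic with the median keyed by (group, column) instead of by group alone. A rough sketch in plain Python, assuming the rows are available as dictionaries; the Fare column and the function name impute_group_medians are invented for this example and are not from the original data.

```python
from collections import defaultdict
from statistics import median


def impute_group_medians(rows, group_key, value_keys):
    """Fill None in every column of value_keys with that column's group median.

    Hypothetical helper for illustration only.
    """
    # Known values per (group, column) pair.
    known = defaultdict(list)
    for row in rows:
        for key in value_keys:
            if row[key] is not None:
                known[(row[group_key], key)].append(row[key])

    medians = {pair: median(values) for pair, values in known.items()}

    filled = []
    for row in rows:
        new_row = dict(row)
        for key in value_keys:
            if new_row[key] is None:
                # Stays None if the whole group is missing this column.
                new_row[key] = medians.get((row[group_key], key))
        filled.append(new_row)
    return filled


# "Fare" is a made-up second column standing in for the ~100 real ones.
rows = [
    {"Title": "Master", "Age": 10, "Fare": 20.0},
    {"Title": "Master", "Age": None, "Fare": 30.0},
    {"Title": "Master", "Age": 14, "Fare": None},
    {"Title": "Mrs", "Age": 26, "Fare": None},
    {"Title": "Mrs", "Age": None, "Fare": 50.0},
]

# Treat every column except the grouping column as one to impute.
columns = [name for name in rows[0] if name != "Title"]
filled = impute_group_medians(rows, "Title", columns)
```

The list of columns is built from the data rather than typed out, so a hundred columns cost no more effort than one.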