Hi All,
I'm trying to replace the NULL values in one of my variables with the median of that variable, grouped by another variable.
For example, given the following dataset, I want to replace each missing Age value with the median Age of all rows that share the same Title.
| Title | Age |
| --- | --- |
| Master | 10 |
| Master | 14 |
| Master | 3 |
| Master | NULL |
| Mrs | 26 |
| Mrs | 52 |
| Mrs | 45 |
| Mrs | NULL |
| Mrs | 76 |
I can accomplish this using a series of filters connected to imputes and then reconnected via a union, but I was hoping there was a more elegant solution.
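For comparison, here is a minimal sketch of the same idea outside the workflow tool, assuming the data can be loaded into a pandas DataFrame. The names `df` and `group_median` are just illustrative, and the data is the example table above. The grouped median is computed once per row, then used only where Age is missing, so no per-group filter and union is needed.

```python
import pandas as pd

# Example data from the table above; None stands in for NULL.
df = pd.DataFrame({
    "Title": ["Master"] * 4 + ["Mrs"] * 5,
    "Age": [10, 14, 3, None, 26, 52, 45, None, 76],
})

# Median Age of each Title, broadcast back to every row of that Title.
# Missing values are ignored when the median is computed.
group_median = df.groupby("Title")["Age"].transform("median")

# Fill only the missing ages; existing values are left untouched.
df["Age"] = df["Age"].fillna(group_median)
```

With this data the missing Master age becomes 10 (median of 3, 10, 14) and the missing Mrs age becomes 48.5 (median of 26, 45, 52, 76).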
Great, thanks!
Is there a way to do this with multiple columns simultaneously? I have around a hundred columns for which I'd like to replace null values with the grouped-by median values, and I'd like to not have to create 100 formulas.
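If the data can be taken into pandas, one possible way to handle many columns at once is sketched below. This is not the workflow-tool answer, only an illustration under that assumption. The helper `fill_with_group_median`, the `demo` data, and the `Fare` column are hypothetical names made up for the example. By default the helper treats every numeric column other than the grouping column as a column to fill.

```python
import pandas as pd

def fill_with_group_median(df, group_col, value_cols=None):
    """Return a copy of df with NULLs replaced by the per-group median."""
    if value_cols is None:
        # Assumption: every numeric column except the grouping one is filled.
        value_cols = df.select_dtypes(include="number").columns.difference([group_col])
    value_cols = list(value_cols)

    # One grouped median per column, aligned row by row with df.
    medians = df.groupby(group_col)[value_cols].transform("median")

    out = df.copy()
    # fillna with a DataFrame aligns on both index and column labels.
    out[value_cols] = out[value_cols].fillna(medians)
    return out

# Hypothetical two-column example; real data would have ~100 columns.
demo = pd.DataFrame({
    "Title": ["Master", "Master", "Master", "Mrs", "Mrs", "Mrs"],
    "Age": [10, None, 14, 26, 52, None],
    "Fare": [1.0, 3.0, None, None, 8.0, 10.0],
})
filled = fill_with_group_median(demo, "Title")
```

The same call works unchanged for a hundred columns, since the column list is derived from the data rather than written out by hand. A group whose values are all missing in some column has no median, so those cells stay NULL.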
