Hi All,
I'm trying to replace the NULL values in one of my variables with the median of that variable, computed within groups defined by another variable.
For example, given the following dataset, I want to replace each missing Age value with the median of all ages that share the same Title.
| Title | Age |
| --- | --- |
| Master | 10 |
| Master | 14 |
| Master | 3 |
| Master | NULL |
| Mrs | 26 |
| Mrs | 52 |
| Mrs | 45 |
| Mrs | NULL |
| Mrs | 76 |
I can accomplish this with a series of filters, each connected to an impute and then recombined via a union, but I was hoping there was a more elegant solution.
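For reference, the logic being asked for can be sketched outside the workflow tool in plain Python: compute one median per Title from the non-null ages, then fill each NULL from its group's median. This is only an illustration of the technique using the sample data above; the function name `impute_group_median` is made up for this sketch, and NULL is represented as `None`.

```python
from statistics import median

# Sample data from the post; None stands in for NULL.
rows = [
    {"Title": "Master", "Age": 10},
    {"Title": "Master", "Age": 14},
    {"Title": "Master", "Age": 3},
    {"Title": "Master", "Age": None},
    {"Title": "Mrs", "Age": 26},
    {"Title": "Mrs", "Age": 52},
    {"Title": "Mrs", "Age": 45},
    {"Title": "Mrs", "Age": None},
    {"Title": "Mrs", "Age": 76},
]


def impute_group_median(rows, group_key, value_key):
    """Return new rows with None values replaced by the group median.

    Hypothetical helper for illustration only.
    """
    # Collect the non-null values for each group.
    values_by_group = {}
    for row in rows:
        if row[value_key] is not None:
            values_by_group.setdefault(row[group_key], []).append(row[value_key])

    # One median per group.
    medians = {group: median(vals) for group, vals in values_by_group.items()}

    # Fill nulls; a group with no non-null values keeps None.
    filled = []
    for row in rows:
        new_row = dict(row)
        if new_row[value_key] is None:
            new_row[value_key] = medians.get(row[group_key])
        filled.append(new_row)
    return filled


filled = impute_group_median(rows, "Title", "Age")
for row in filled:
    print(row["Title"], row["Age"])
```

With the sample data, the missing Master age becomes 10 (the median of 3, 10, 14) and the missing Mrs age becomes 48.5 (the midpoint of 45 and 52).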
Great, thanks!
Is there a way to do this with multiple columns simultaneously? I have around a hundred columns for which I'd like to replace null values with the grouped-by median values, and I'd like to not have to create 100 formulas.
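One way to think about the multi-column case is to loop over the column names rather than writing one rule per column. The sketch below shows that idea in plain Python. It is not a workflow-tool solution, and the `Fare` column, the sample rows, and the function name `impute_group_medians` are all invented for illustration.

```python
from statistics import median


def impute_group_medians(rows, group_key, value_keys):
    """Fill None in every listed column with that column's group median.

    Hypothetical helper for illustration only.
    """
    # medians[column][group] -> median of the non-null values
    medians = {}
    for key in value_keys:
        per_group = {}
        for row in rows:
            if row[key] is not None:
                per_group.setdefault(row[group_key], []).append(row[key])
        medians[key] = {g: median(v) for g, v in per_group.items()}

    filled = []
    for row in rows:
        new_row = dict(row)
        for key in value_keys:
            if new_row[key] is None:
                # Stays None if the whole group is null for this column.
                new_row[key] = medians[key].get(row[group_key])
        filled.append(new_row)
    return filled


# Made-up sample rows; "Fare" is a hypothetical second column.
rows = [
    {"Title": "Master", "Age": 10, "Fare": 20.0},
    {"Title": "Master", "Age": None, "Fare": 30.0},
    {"Title": "Master", "Age": 4, "Fare": None},
    {"Title": "Mrs", "Age": 26, "Fare": None},
    {"Title": "Mrs", "Age": None, "Fare": 80.0},
]

# Treat every column except the grouping column as a value column,
# so the list does not have to be typed out for ~100 columns.
value_keys = [k for k in rows[0] if k != "Title"]
filled = impute_group_medians(rows, "Title", value_keys)
for row in filled:
    print(row)
```

Deriving `value_keys` from the data means the same call works whether there are two columns or a hundred, as long as they are all numeric.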
