Hi there,
The imputation tool is reducing the number of columns in my data set. I am feeding in 100+ columns but only looking to impute 5 of them. The output of the tool has fewer than 100 columns, and the missing ones seem to be dropped at random. Unless I am using the tool incorrectly, can you please advise on this?
Thanks,
David
Hi @dcleere,
I've moved your post over to data preparation and blending. Can you post a sample workflow for us to take a look at?
Thanks,
If the imputation is simple, I prefer to use the multi-field formula tool.
As an example, to replace NULL values in string variables with blanks:
IIF(IsNull([_CurrentField_]),'',[_CurrentField_])
Or, to set NULL values in numeric variables to zero (0):
IIF(IsNull([_CurrentField_]),0,[_CurrentField_])
In the configuration, you choose the type of variables to operate on and then select the fields. You can either create "NEW" variables or update the existing ones directly.
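For readers who want to see the same logic outside the tool, here is a minimal Python sketch of what the two formulas above do. It is only an illustration: the function name `impute_nulls` and the sample field names are made up for this example, and it is not how the multi-field formula tool is implemented.

```python
def impute_nulls(rows, fields, fill):
    """Return copies of the rows with None replaced by `fill` in the chosen fields.

    Fields that are not listed are passed through untouched, so no column
    is ever added or dropped.
    """
    return [
        {name: (fill if name in fields and value is None else value)
         for name, value in row.items()}
        for row in rows
    ]


# Hypothetical sample data: one string field, one numeric field, one untouched field.
rows = [
    {"name": None, "score": 3.5, "notes": None},
    {"name": "a", "score": None, "notes": "ok"},
]

# String fields: NULL becomes blank, like IIF(IsNull([_CurrentField_]),'',[_CurrentField_])
rows = impute_nulls(rows, {"name"}, "")
# Numeric fields: NULL becomes 0, like IIF(IsNull([_CurrentField_]),0,[_CurrentField_])
rows = impute_nulls(rows, {"score"}, 0)
```

Only the selected fields are changed, and every row keeps the same set of columns it started with.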
Cheers,
Mark
Hi Jessica,
Please find a sample workflow attached. It uses mock data, and the earlier steps mock what is contained in my actual workflow:
- Select unique on Col1
- Impute the last three columns in the data set
You will see that the number of columns drops from 113 to 111 following the imputation tool.
Thanks,
David
Hi @dcleere,
We will review this internally and follow back up with you.
Thanks,
Hi @dcleere,
I was able to take a look here. Under the hood, the tool has a regex statement to deselect the imputed value indicator field and the secondary imputed values field when the below settings are checked.
The regex inside the dynamic select tool is looking for fields that contain _ followed by a string containing either mp or nd (the fields created by the tool have a suffix of either _Indicator or _ImputedValue).
It just so happens that your field names containing _INDUST match this regex.
To work around this, rename the fields to a name that does not contain 'IN' after an _, such as INDUST_ERNINGS_AGE, or select the output option 'imputed value indicator.'
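To make the collision concrete, here is a rough Python sketch. The tool's real regex is not shown in this thread, so the pattern below is an assumption built only from the description above (an _ followed by a string containing mp or nd, matched without regard to case). The field names other than INDUST_ERNINGS_AGE are hypothetical.

```python
import re

# Assumed pattern, not the tool's actual regex: an underscore followed
# somewhere later by "mp" or "nd". Case-insensitivity is also an assumption,
# inferred from the fact that the upper-case _INDUST fields were caught.
ASSUMED_PATTERN = re.compile(r"_.*(mp|nd)", re.IGNORECASE)


def dropped_fields(field_names):
    """Return the field names that the assumed pattern would deselect."""
    return [name for name in field_names if ASSUMED_PATTERN.search(name)]


fields = [
    "Col1",
    "Sales_Indicator",      # hypothetical field created by the tool
    "Sales_ImputedValue",   # hypothetical field created by the tool
    "ERNINGS_INDUST_AGE",   # hypothetical user field containing _INDUST
    "INDUST_ERNINGS_AGE",   # the renamed form suggested above
]

dropped = dropped_fields(fields)
kept = [name for name in fields if name not in dropped]
```

Under these assumptions, the user field with _INDUST is deselected along with the two tool-generated fields, while the renamed INDUST_ERNINGS_AGE is kept because nothing after an underscore contains mp or nd.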
@MarqueeCrew's suggestion is a great workaround as well.
I will alert our development team about this issue.
Thanks,
Hi Jessica,
I'm trying to complete a tutorial and am using the Imputation tool to replace null values with a median value. The problem I'm experiencing is that only the very first column of the data set is appearing under 'Fields to impute'. Why don't I see all of the columns?
Thank you,
Tom
Hi @tjschubert,
Could you create a new thread for your question and include the workflow as an attachment so we can take a look?
Thanks,
Sorry, but how do I start my own question thread? I don't see the option to do that.
Thanks,