The Assisted Modeling Tool has incorrectly identified features of my dataset as unary and therefore will not allow me to include them in my model.
I am using the Assisted Modeling Tool to review the appropriateness of access based on job function. The target variable is the appropriateness of the role for the data (good or requires further review). My dataset contains users granted access through sensitive profiles. As a result, variables like the profile name, description, and system name are repeated several, if not hundreds of, times in the dataset, but they are not unary.
The Title field should also be categorical, but I do not have the option to change it.
Is there a way to correct the tool?
Thank you.
Hi @meganmh,
Since the Assisted Modeling algorithm does not detect predictive value in the columns identified as unary, the predictive algorithms would likely not create an accurate model with those columns as predictors.
It might be possible to change the dataset format to make it more suitable for predictive modeling. For example, you could translate the columns identified as unary into binary columns using a Formula tool. These columns would have the attribute in the column header and a value of either 1 or 0 depending on whether the row has that attribute. A set of true/false columns may provide greater predictive power than columns with only a few unique values.
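To illustrate the idea outside of Designer, here is a minimal Python sketch of the same transformation. It is not how the Formula tool works internally, and the sample rows, the `profile_name` column, and the `one_hot` helper are all hypothetical names made up for this example.

```python
def one_hot(rows, column):
    """Return copies of rows with `column` replaced by 0/1 indicator columns."""
    # One new column per distinct value found in the original column.
    categories = sorted({row[column] for row in rows})
    encoded = []
    for row in rows:
        # Keep every field except the one being encoded.
        new_row = {key: value for key, value in row.items() if key != column}
        for category in categories:
            # 1 if this row has the attribute, otherwise 0.
            new_row[f"{column}={category}"] = 1 if row[column] == category else 0
        encoded.append(new_row)
    return encoded


# Hypothetical sample data, not taken from the original dataset.
rows = [
    {"user": "u1", "profile_name": "Finance Admin"},
    {"user": "u2", "profile_name": "HR Viewer"},
    {"user": "u3", "profile_name": "Finance Admin"},
]

encoded = one_hot(rows, "profile_name")
for row in encoded:
    print(row)
```

Each distinct value becomes its own column header, so a column with many distinct values will produce many new columns. It may be worth grouping rare values together before encoding.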
Please try going to the Virtual Solution Center (VSC) website and selecting the option "I need help building something." You can schedule time with one of our experts to review your use case in more detail and get suggestions on how to get the most out of your data.