Hi all,
I built a huge datamart (more than 8,000 fields) and now I'd like to build a predictive model on a binary target.
I'd like to know if there is an algorithm, such as a feature selection algorithm, that could pre-select fields. For example, it would drop all the fields that are not correlated with the target, to reduce the number of candidate fields for modeling.
thanks,
Franck
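For illustration only, here is a minimal sketch of the kind of correlation pre-filter described above, written in plain Python rather than as an Alteryx workflow. The function names (`pearson`, `preselect`), the `min_abs_corr` threshold, and the toy data are all assumptions made up for this example. It only handles numeric fields and only detects linear relationships with the 0/1 target.

```python
import math

def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)

def preselect(columns, target, min_abs_corr=0.1):
    """Keep the names of fields whose |correlation| with the target meets the threshold."""
    return [name for name, values in columns.items()
            if abs(pearson(values, target)) >= min_abs_corr]

# Toy data, made up for illustration (binary target coded as 0/1).
target = [0, 0, 1, 1, 0, 1]
columns = {
    "signal": [1.0, 2.0, 8.0, 9.0, 1.5, 7.5],
    "noise": [5.0, 3.0, 5.0, 3.0, 4.0, 4.0],
    "constant": [2.0] * 6,
}
selected = preselect(columns, target, min_abs_corr=0.3)
print(selected)
```

The threshold is a tuning choice, not a recommended value; with thousands of fields it is usually worth checking how many survive at a few different cutoffs.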
Try the Importance Weights tool available in the predictive district here: https://gallery.alteryx.com/#!app/Importance-Weights/56bbd2643df7da08b8fccfe9
Thanks for your answer.
I downloaded and opened the file 'Importance weights.yxmc' in Alteryx. What do I have to do to use it on my data?
I'm not an expert yet and it's the first time I've used a macro. :)
Franck
Once you install the macro (see here), it will appear on your tool palette and you can use it just like any other tool. This particular one requires Java 7 to be installed on your computer.
Thanks!
Once I use this tool for a categorical target, how do I go about selecting the right subset of variables? It's giving me 3 different data items as outputs.
Can I sort by Information Gain (1) in descending order and take, say, the top 20? Or do you have any recommendations on determining the threshold?
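To make the "sort descending and take the top k" idea concrete, here is a minimal Python sketch. It computes a textbook entropy-based information gain for categorical fields and keeps the k highest-scoring ones. This is an assumption-laden illustration, not the macro's actual implementation: the function names, the base-2 logarithm, and the toy data are all made up here, and the macro's scores may be scaled differently.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(feature, labels):
    """Entropy of the labels minus their entropy conditional on the feature."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional

def top_k(features, labels, k):
    """Names of the k fields with the highest information gain, best first."""
    scores = {name: information_gain(vals, labels) for name, vals in features.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Toy data, made up for illustration.
labels = ["y", "y", "n", "n"]
features = {
    "useless": ["a", "b", "a", "b"],
    "partial": ["a", "a", "a", "b"],
    "perfect": ["a", "a", "b", "b"],
}
best = top_k(features, labels, k=2)
print(best)
```

The value of k (20 in the question, 2 here) is a judgment call. One option is to try a few values and compare the resulting models on held-out data.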
Neil,
I have Java 8 installed on my computer. Is there anything I have to do differently? Whenever it runs, I get the error ".onLoad failed in loadNamespace() for 'rJava', details: package or namespace load failed for 'FSelector'"
Thanks,
Jake
Java 8 should be fine. Make sure it's the 64-bit (x64) version, though.