Alteryx Designer Desktop Discussions

Applying a PCA model to new data

aatalai
14 - Magnetar

Hi All

TA

BS_THE_ANALYST
14 - Magnetar

@aatalai it might be worth putting this post in the ML forum as well: https://community.alteryx.com/t5/Alteryx-Machine-Learning-Discussions/bd-p/machine-learning

All the best,
BS

BS_THE_ANALYST
14 - Magnetar

@aatalai 

Am I right in thinking you've done PCA on some data and you now want to take that model and apply it to new data?

If yes, I believe one of the predictive lessons on Community shows how to take a trained model and apply it to a new dataset:
https://community.alteryx.com/t5/Interactive-Lessons/Creating-a-Predictive-Model/ta-p/504767 

 

Here's a picture of where to look on the lesson:
1.png

 

Hopefully that answers your question @aatalai .

All the best,

BS

aatalai
14 - Magnetar

@BS_THE_ANALYST 

 

Thanks for your responses.

 

a) As this is more about a tool in Designer than about ML techniques, I thought it would suit this forum better, but I'm agnostic on where it lives.

 

b) There is no model object in the PCA tool's output. I also added a Score tool for completeness; see the screenshot below.

 

PCA score.PNG

BS_THE_ANALYST
14 - Magnetar

@aatalai I can completely see the issue at hand. The PCA tool doesn't output a model object. Therefore, at least in Alteryx, you'd need to run each new set of data through PCA again.

 

I can see on the article you linked there's a pipeline occurring to fit the initial model, and from that point forward you just need to transform the new data and then you can predict.
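For reference, here is a minimal sketch of that fit-once, transform-later pattern using scikit-learn. The data is randomly generated, and the number of components and the scaling step are assumptions for illustration, not settings taken from the thread.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Made-up data standing in for the original and the new records (same columns).
rng = np.random.default_rng(0)
X_train = rng.normal(size=(100, 5))
X_new = rng.normal(size=(10, 5))

# Fit the scaling and the PCA once, on the original data only.
pca_pipeline = make_pipeline(StandardScaler(), PCA(n_components=2))
pca_pipeline.fit(X_train)

# Re-use the fitted pipeline: new rows are projected onto the components
# learned from X_train, without refitting anything.
new_scores = pca_pipeline.transform(X_new)
print(new_scores.shape)
```

The key point is that `fit` is only ever called on the original data; the new data only goes through `transform`.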

 

It might be a good angle to use the Python tool and implement this yourself. I learned how to build pipelines from this course, which has a useful section on pipelines: https://www.kaggle.com/learn/intermediate-machine-learning

You don't have to use a pipeline though: the response with 10 likes has the sequential steps needed, I believe. I imagine the tool you create would have an interface like "import model from ___" or "create model". You could also consider the approach described here: https://datascience.stackexchange.com/questions/55066/how-to-export-pca-to-use-in-another-program
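As a rough sketch of what "create model" and "import model" could look like inside the Python tool, the example below uses only NumPy. The function names (`create_model`, `save_model`, `import_model`, `apply_model`), the file name, and the random data are all hypothetical. It assumes PCA on centred but unscaled fields; if the fields are scaled to unit variance before PCA, the scaling parameters would need to be saved alongside the mean and components.

```python
import os
import tempfile

import numpy as np


def create_model(X, n_components):
    """Fit PCA on X (rows = records) and keep what is needed to score later."""
    mean = X.mean(axis=0)
    # SVD of the centred data; the rows of vt are the principal axes.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return {"mean": mean, "components": vt[:n_components]}


def save_model(model, path):
    """Write the mean and components to an .npz file."""
    np.savez(path, mean=model["mean"], components=model["components"])


def import_model(path):
    """Read back a model written by save_model."""
    with np.load(path) as data:
        return {"mean": data["mean"], "components": data["components"]}


def apply_model(model, X_new):
    """Project new records onto the stored components."""
    return (X_new - model["mean"]) @ model["components"].T


# Made-up data standing in for the original and the new records.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(100, 5))
X_new = rng.normal(size=(10, 5))

model_path = os.path.join(tempfile.mkdtemp(), "pca_model.npz")
model = create_model(X_train, n_components=2)
save_model(model, model_path)

# Later, possibly in a different workflow: load the model and score new data.
restored = import_model(model_path)
new_scores = apply_model(restored, X_new)
print(new_scores.shape)
```

Storing just the mean and the component matrix keeps the saved model readable from any program that can do a subtraction and a matrix multiply, which is the idea behind the export approach in the last link.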

 

All the best,

BS 
