Hi,
I'm very new to Alteryx. I have a data set containing wine features, with one column called "quality" (0 to 10). I would like to:
1) Change or create a column based on the quality column, so that all the numbers from 0 to 5 show as "non profitable" and 6 and up show as "profitable".
2) Using the same "quality" column, split the numbers as shown below, depending on whether the wine is white or red. I don't know how to do this. The values are the profit for each wine.
Wine quality | White wine | Red wine |
9-10 | 50 CAD | 60 CAD |
7-8 | 25 CAD | 20 CAD |
6 | 5 CAD | 5 CAD |
1-5 | 0 CAD | 0 CAD |
How can I manage to do that? I have attached my workflow.
Thank you very much for any tips or help on this.
Hello @domax37 ,
Your workflow looks good for now. I don't know if this is what you are expecting or something similar.
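For reference, here is a minimal sketch in Python (not Alteryx) of the two rules described in the question, just to make the logic explicit. The function names and the 'white'/'red' values for the wine type are assumptions for illustration, and a quality of 0 is treated as 0 CAD even though the profit table starts at 1. In Designer, the same conditions could probably be written in a Formula tool with an IF ... ELSEIF ... ENDIF expression.

```python
# Profit per wine in CAD, copied from the table in the question.
# Each entry: (lowest quality in the band, white wine profit, red wine profit).
PROFIT_BANDS = [
    (9, 50, 60),  # quality 9-10
    (7, 25, 20),  # quality 7-8
    (6, 5, 5),    # quality 6
    (0, 0, 0),    # quality 1-5 (0 included here as an assumption)
]


def quality_label(quality):
    """Return 'profitable' for quality 6 and up, otherwise 'non profitable'."""
    return "profitable" if quality >= 6 else "non profitable"


def profit_cad(quality, wine_type):
    """Look up the profit in CAD for a quality score and a wine type.

    wine_type is assumed to be the string 'white' or 'red'.
    """
    if wine_type not in ("white", "red"):
        raise ValueError("wine_type must be 'white' or 'red'")
    # Bands are ordered from highest to lowest, so the first match wins.
    for lowest, white, red in PROFIT_BANDS:
        if quality >= lowest:
            return white if wine_type == "white" else red
    raise ValueError("quality must be 0 or higher")
```

For example, `profit_cad(8, "white")` falls in the 7-8 band and gives 25, while `quality_label(5)` gives "non profitable".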
Regards
Hi! Thanks a lot for replying and thank you for trying to help me out.
It's not quite that.
My objective is to run a prediction/classification analysis based on all the features contained in the database (acidity, sugar level, pH, etc.) and to run different models (I think I will try a logistic regression, k-means clusters and a decision tree (random tree)), in order to build a model that would decide the next selection of wine based on profit and quality. But before trying to do that, I want to change the information in the "quality" column, because under 6 (0-5) it's not quality, while 6 and over is considered good quality.
Maybe I should change quality vs. non quality right away, and then, after I run the models, change the information as you just did (based on profit)?
Do I make sense? Not sure, I'm so confused... 😀 How would you do it?
Catherine
Hi @domax37
FYI, for fun I ran your data through our Assisted Modeling (Intelligence Suite) and all 4 classification models (XGBoost, Random Forest, Logistic Regression, Decision Tree) predicted your quality/not quality category assessment with around 75% accuracy using the predictor variables in the data. Interestingly, the bulk of the false positives were 5s and the bulk of the false negatives were 6s.
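To make the false positive / false negative observation concrete, here is a rough sketch in Python of how the errors could be counted per original quality score. It is not what Assisted Modeling does internally. The function name, the `(quality, predicted_good)` input shape and the sample rows are all made up for illustration, and "good" is assumed to mean quality 6 and up.

```python
from collections import Counter


def errors_by_quality(rows):
    """Count false positives and false negatives per original quality score.

    rows is an iterable of (quality, predicted_good) pairs, where
    predicted_good is True when the model predicted 'good quality'.
    """
    false_pos = Counter()  # predicted good, actually below 6
    false_neg = Counter()  # predicted not good, actually 6 or above
    for quality, predicted_good in rows:
        actual_good = quality >= 6
        if predicted_good and not actual_good:
            false_pos[quality] += 1
        elif actual_good and not predicted_good:
            false_neg[quality] += 1
    return false_pos, false_neg


# Made-up rows, only to show the shape of the input.
sample = [(5, True), (5, True), (4, True), (6, False),
          (6, False), (7, False), (8, True), (3, False)]
false_pos, false_neg = errors_by_quality(sample)
```

With real predictions, a pile-up of false positives at 5 and false negatives at 6 would mean the models mostly struggle right at the boundary between the two categories.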
Thank you! What is the Intelligence Suite? Is it something outside of Alteryx Designer?
Can I do the same thing within Designer?
Thanks
Catherine
Hi @domax37
The Intelligence Suite is an additionally licensed component that can be added to Designer that includes what is called Assisted Modeling/Machine Learning. It gives analysts with limited Data Science experience (like myself) access to advanced predictive modeling features, but more importantly the proper guidance through the modeling process.
Designer on its own can do these things well, but you need the data science knowledge to make it happen. Intelligence Suite takes care of that for those of us who need help with key steps like selecting the correct model for our use case, properly profiling and prepping the data variables, and managing the model output.
I like to say that Intelligence Suite is like hiring a chauffeur when you own a car but you don't know how to drive it. Your account manager can get you set up with a demonstration and maybe even a trial if you are interested.
Hope this helps.
Phil
Hi Phil,
Last question for you, and thank you again for your help. Find attached my workflow. I'm still trying to figure out if these results are OK or very bad. From the ROC curve, I would choose the random forest one. But I don't see what you said about the false negatives and false positives.
I split my data 75%-25% and also split red and white. Accuracies are around 71-75%. Is this good?
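One common way to judge an accuracy figure like this is to compare it with the accuracy of always predicting the most frequent class in the test set. Below is a small sketch in Python of that comparison. The function names and the labels in the usage note are hypothetical, not taken from the attached workflow.

```python
from collections import Counter


def majority_baseline_accuracy(labels):
    """Accuracy obtained by always predicting the most frequent label."""
    if not labels:
        raise ValueError("labels must not be empty")
    most_common_count = Counter(labels).most_common(1)[0][1]
    return most_common_count / len(labels)


def accuracy(actual, predicted):
    """Share of rows where the predicted label equals the actual label."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("actual and predicted must be non-empty and the same length")
    matches = sum(a == p for a, p in zip(actual, predicted))
    return matches / len(actual)
```

For instance, if 3 out of 4 test rows were "good", always answering "good" would already score 0.75, so a model at 75% would add little. If the two classes were closer to 50/50, the same 75% would be a clearer improvement.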