
Alteryx Designer Desktop Discussions

SOLVED

Basic Predictive Model Help

Jenny_Vu
7 - Meteor

Hello everyone! I am a student learning Alteryx at a beginner level. I tried to build this basic model, but the data stops flowing at a certain point. I am very lost and don't fully understand what I have done, despite searching and reading the available resources. Please help.

Also, please let me know if someone is keen to be an Alteryx personal tutor. If you can share how to learn Alteryx most effectively for building marketing-related models, it would be much appreciated! Thank you in advance 🙏

15 REPLIES
Jenny_Vu
7 - Meteor

Hello! Thank you for your point of view and explanation. Is this how you think it should be? I followed your guide as I understood it and added the last part of Qiu's model because it summarizes the final output well. I also added a forest model to predict sales. Please let me know if the logistic model for predicting email response rate is set up as you explained. Also, may I ask how to solve this error?

[Attached screenshot: Screenshot 2025-05-18 233502.png]

KGT
13 - Pulsar

Yep, that's better. A couple of minor things to get you moving. I'll see if I get a chance to update the workflow and post in a couple of hours.

 

In your Multiple Join, you want to:

  • Select [Catalogue Type] from Input #3 so that it is added, as that's one of your created fields.
  • Drop Connection 4, as you have your entire dataset coming through Input 2 with the additional fields created on that stream. (No harm though, as the fields aren't selected.)

In Join (38), the normal Join tool before the predictive tools:

  • Join on Household Type == Household Type.
  • Then take the output from the J connector.

Then there is the predictive model. I haven't run it through, but from a quick glance, the field types need to be adjusted. You have some continuous fields coming through as strings, which causes category errors on the Score. If they are strings, they will be treated as categorical variables, so 10.1234 will be a different category to 10.1244, most likely meaning every value in the field is unique. I love using the Field Summary tool (Data Investigation palette) and its I (Interactive) output to look at my fields and work out whether they have too many categories or are too unique.
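To illustrate the point about strings being treated as categories, here is a minimal Python sketch (not Alteryx code). The sample values and the function name `distinct_ratio` are made up for illustration; the idea is only to show why a continuous field stored as text ends up with one category per value, and that converting it to a numeric type avoids this.

```python
def distinct_ratio(values):
    """Share of values that are distinct (1.0 means every value is unique)."""
    return len(set(values)) / len(values)


# Hypothetical continuous field that arrived as strings.
spend_as_text = ["10.1234", "10.1244", "25.5000", "31.0700"]

# Hypothetical field that really is categorical.
household_type = ["Family", "Single", "Family", "Family"]

# As text, every spend value is its own category, so the ratio is 1.0.
spend_ratio = distinct_ratio(spend_as_text)

# A genuine categorical field repeats its values, so the ratio is lower.
household_ratio = distinct_ratio(household_type)

# Converting to a numeric type lets a model treat the field as continuous.
spend_as_number = [float(v) for v in spend_as_text]

print(spend_ratio, household_ratio, spend_as_number)
```

A ratio close to 1.0 on a field you expected to be numeric is a hint that its type needs changing before it reaches the model.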

 

 

KGT
13 - Pulsar

I had a look and adjusted a few things.

 

There was a field called Responded_Binary being used in the model. It was a duplicate of your target, which is why the model came out exact.

Jenny_Vu
7 - Meteor

Thank you for your explanation! I asked ChatGPT so many times and looked up Alteryx resources but couldn't figure out that the data types of certain fields were the problem!

May I ask why we have to join on Household Type = Household Type and not Customer ID = Customer ID?

Jenny_Vu
7 - Meteor

Thank you very much for taking the time to demonstrate how it should be done! Much appreciated. It helps a lot, as I can compare this one with my own and better understand what you explained earlier too!

Should I remove the field Responded_Binary? Also, may I ask why you still kept connection #4, which is now #1 in your model (you explained earlier that it is not needed)?

Many thanks.

KGT
13 - Pulsar

Happy to help. I love the low barrier to entry for Predictive Analytics in Alteryx, but I'm passionate about helping people understand what techniques are needed to make something valuable.

 

I think I was re-configuring while typing and waiting for another flow to finish running. I said Connection 4 was not needed because you had all that data on another connection; however, I ended up taking the original data from it and removing all that from the other connections. This method was to better show that the feature engineering was adding fields to the data.

 

And yes, you can remove Responded_Binary. Removing the field itself is not super important, but removing it from the predictors in the model is necessary. The Association Analysis tool (Data Investigation palette), via its I connector output, will help you see which variables are too highly correlated with the target to include. If a field is too highly correlated, you are essentially telling the model what the response will be rather than having it predict what it should be.
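The leakage check described above can be sketched in plain Python (this is not what the Association Analysis tool runs internally, just the same idea using Pearson correlation). The sample data, the `Past_Orders` field, the helper names, and the 0.95 threshold are all assumptions for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    denom = sqrt(var_x * var_y)
    # A constant field has no variance; treat it as uncorrelated.
    return cov / denom if denom else 0.0


def leaky_predictors(target, predictors, threshold=0.95):
    """Names of predictors whose |correlation| with the target meets the threshold."""
    return [
        name
        for name, values in predictors.items()
        if abs(pearson(values, target)) >= threshold
    ]


# Hypothetical target and candidate predictors.
responded = [1, 0, 0, 1, 0, 1]
predictors = {
    "Responded_Binary": [1, 0, 0, 1, 0, 1],  # duplicate of the target
    "Past_Orders": [3, 1, 4, 2, 2, 5],       # made-up ordinary predictor
}

flagged = leaky_predictors(responded, predictors)
print(flagged)
```

Anything flagged this way is a candidate to drop from the predictor list, since it effectively hands the model the answer.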

 

As for the join on Household Type: the field being added has been grouped to the household level, so that stream no longer has Customer ID. It wouldn't make sense to still have Customer ID at that point. The join adds features that are grouped at the household level.
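A rough Python sketch of why the join key has to be Household Type: once a feature is summarized per household type, the summarized rows have no Customer ID left to join on. The records, the `Sales` field, and the `Avg Household Sales` feature name are hypothetical and only stand in for whatever the workflow actually aggregates.

```python
# Hypothetical customer-level records.
customers = [
    {"Customer ID": 1, "Household Type": "Family", "Sales": 120.0},
    {"Customer ID": 2, "Household Type": "Single", "Sales": 40.0},
    {"Customer ID": 3, "Household Type": "Family", "Sales": 80.0},
]

# Summarize step: group by Household Type. The result is keyed only by
# Household Type, so Customer ID no longer exists on this stream.
totals = {}
counts = {}
for row in customers:
    key = row["Household Type"]
    totals[key] = totals.get(key, 0.0) + row["Sales"]
    counts[key] = counts.get(key, 0) + 1
household_avg_sales = {key: totals[key] / counts[key] for key in totals}

# Join step: attach the household-level feature back to each customer,
# matching on Household Type (the only key the two streams share).
enriched = [
    {**row, "Avg Household Sales": household_avg_sales[row["Household Type"]]}
    for row in customers
]

for row in enriched:
    print(row)
```

Every customer in the same household type receives the same value, which is what a feature grouped at the household level should look like.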
