Alteryx Designer Desktop Discussions

SOLVED

"Guessing" Age

bvolles
8 - Asteroid

Hello all,

I am new here and to Alteryx, so pardon me if I ask some really obvious questions.  I have a data set that contains ages of passengers, but not all the ages are populated.  Is there a way to "guess" what those missing ages could be?  It was mentioned that the Impute tool would only provide a crude estimate, while I want something semi-accurate.

 

Thanks 

MSalvage
11 - Bolide

@bvolles, 

 

This sounds like you essentially want to predict the ages of passengers. If that is the case, you will want to use Alteryx's predictive tools to build a model. Whether the model is accurate will depend on what other data fields you have on the passengers.

 

Including sample data would help people be able to generate examples for you. 

 

Good Luck, 

MSalvage

bvolles
8 - Asteroid

My apologies.  I have attached my workflow.

NickDuncan
7 - Meteor

Hi bvolles,

 

Yes, you could use a Linear Regression tool to estimate the ages based on other fields in your data.

 

The trick would be understanding which other fields could be useful in predicting age. 'Title' would be a good one, a passenger named 'Master' is likely to be a young boy, 'Miss' is likely to be a young girl or woman, 'Mrs' is likely to be an older married woman, etc. Your parsing isn't perfect though, so 'Title' is causing an error in the regression. Also, 'sibsp' might help predict age, since the more siblings on board you have, the more likely it is that you are travelling with a large family, and are likely one of many children. You can mess around with the variables to improve your prediction.
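To illustrate the 'Title' idea outside of Alteryx, here is a minimal Python sketch of pulling a title out of a name. It assumes the names look like "Surname, Title. Given names", which I have not confirmed against your data, and the sample names and the `extract_title` helper are made up for illustration.

```python
import re

# Assumption: names follow the layout "Surname, Title. Given names".
# The title is whatever sits between the first comma and the next period.
TITLE_PATTERN = re.compile(r",\s*([^.,]+)\.")


def extract_title(name):
    """Return the title found in a name, or None if the layout does not match."""
    match = TITLE_PATTERN.search(name)
    return match.group(1).strip() if match else None


# Hypothetical sample names, not taken from the attached workflow.
sample_names = [
    "Braund, Mr. Owen Harris",
    "Rice, Master. Eugene",
    "Heikkinen, Miss. Laina",
    "Name without a title",
]

for name in sample_names:
    print(name, "->", extract_title(name))
```

Returning None for names that do not match makes the badly parsed rows easy to filter out before they reach the regression.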

 

The investigation tools in Alteryx help you understand which predictive variables you could use. The Scatterplot and Association Analysis tools are good ones: look for strong correlations with age and p-values below 0.05.

 

Then, feed the passengers with known ages into a Linear Regression tool with your chosen variables and estimate (Score tool) the ages of the rest. Union your final dataset together and you'll have estimated ages that will be much better than a simple average.

 

Workflow attached.

 


bvolles
8 - Asteroid

Thank you very much.  

bvolles
8 - Asteroid

NickDuncan, it says the macros are inaccessible or unavailable?

NickDuncan
7 - Meteor

 

There shouldn't be any macros. Make sure you have the predictive tools installed.

 

http://downloads.alteryx.com/predictive.html
