Alteryx Designer Discussions

Create Custom Demographic Variables in Allocate

Asteroid

Hi All,

 

I had a question on finding correlations within Alteryx. I have a dataset with Age, Zip Codes, Gender, and another with medical conditions. I brought in demographic variables to this dataset based off those zip codes. I'm now trying to find a correlation between some of these variables and the dataset I have. For instance, the correlation between having these medical conditions along with a variable like how often they eat out or drink alcoholic beverages. Is there a way to perform such a task and visualize it within Alteryx?

 

Alteryx Certified Partner

Responding to your topic title, here's an article on how custom variables can be created and included in the Allocate tools within Alteryx:

https://community.alteryx.com/t5/Location-Data-Knowledge-Base/Create-Custom-Demographic-Variables-in...

 

How did you bring in the additional demographic variables? Did the data provided to you have ZIP code numbers, or spatial objects? I ask because ZIP code polygons can vary a lot between data providers, so what your provider calls "90210" may cover a very different area from what another provider calls "90210". Spatial objects are typically best to make sure everything has a common geographic base.

 

Calculating correlation between many variables can be done using the Data Investigation tools, specifically the Pearson Correlation and Spearman Correlation tools. 

https://help.alteryx.com/2018.3/PearsonCorrelation.htm
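To give a rough idea of what those two tools compute, here is a minimal sketch in plain Python, outside of Alteryx. It is not the tools' actual implementation. The function names (`pearson`, `average_ranks`, `spearman`) and the sample numbers are made up for illustration. Pearson measures linear association on the raw values, while Spearman applies the same formula to the ranks of the values.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def average_ranks(values):
    """Rank values from 1 to n; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical data: 1 = has the condition, 0 = does not,
# paired with a made-up average alcohol spend for each person's ZIP code.
has_condition = [1, 0, 1, 1, 0, 0, 1, 0]
alcohol_spend = [520.0, 310.0, 480.0, 610.0, 290.0, 350.0, 450.0, 330.0]

print("Pearson: ", pearson(has_condition, alcohol_spend))
print("Spearman:", spearman(has_condition, alcohol_spend))
```

With a Yes/No column coded as 1/0, the Pearson coefficient is the same thing as the point-biserial correlation.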

 

Let me know if this helps or if you have further questions.

Asteroid

Hi @CharlieS ,

 

Thanks for the response! I brought in the demographics based off of Zip Code. How can I match based off Spatial Objects if I'm only provided Zip Codes? 

 

I'm taking a look at the 2 correlation tools you mentioned below. Just as reference - I'm trying to figure out how likely it is for someone who spends a lot of money on alcohol, prescription drugs, sugar, outside food, etc. to also suffer from a medical condition. So in a column I have Yes/No responses on whether an individual suffers from a medical condition. Based off their zip code, I'm bringing in demographics on avg expenditures within that zip code for some variables I mentioned above. I'm then trying to calculate the correlation between them having a medical condition and those outside variables. I was using the Logistic Regression and Scoring tool. Not sure if that's the best way to go about it? Any tips would be greatly appreciated!

Alteryx Certified Partner

@hydrogurl01 wrote:

Hi @CharlieS ,

 

Thanks for the response! I brought in the demographics based off of Zip Code. How can I match based off Spatial Objects if I'm only provided Zip Codes? 


Happy to help! If you don't have spatial objects for the provided ZIP codes, you'll be unable to do a spatial comparison. I would suggest taking a look at what your two sources of data say about the population in the same ZIP, because there might be some corrections you can make with that information. However, this might be inconsequential if all the variables you're working with are qualitative in nature.
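As a rough illustration of that population check, here is a small sketch in plain Python, outside of Alteryx. Everything in it is an assumption for the sake of example: the function name `population_mismatches`, the 10% tolerance, and the population figures are all made up. It only flags ZIP codes where the two sources disagree; what correction to apply, if any, is a separate decision.

```python
def population_mismatches(source_a, source_b, tolerance=0.10):
    """Return {zip: ratio} for ZIPs in both sources whose population in
    source_b differs from source_a by more than `tolerance` (relative)."""
    flagged = {}
    # Only ZIP codes present in both sources can be compared.
    for zip_code in sorted(source_a.keys() & source_b.keys()):
        pop_a = source_a[zip_code]
        pop_b = source_b[zip_code]
        if pop_a == 0:
            continue  # a ratio is undefined when the baseline population is zero
        ratio = pop_b / pop_a
        if abs(ratio - 1.0) > tolerance:
            flagged[zip_code] = ratio
    return flagged


# Hypothetical population counts per ZIP from two providers.
# ZIP codes are kept as strings so leading zeros are not lost.
provider_a = {"90210": 20000, "10001": 25000, "60601": 15000}
provider_b = {"90210": 26000, "10001": 25500, "73301": 9000}

for zip_code, ratio in population_mismatches(provider_a, provider_b).items():
    print(zip_code, "population ratio (b / a):", round(ratio, 2))
```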

 


@hydrogurl01 wrote:

Hi @CharlieS ,

 

I'm taking a look at the 2 correlation tools you mentioned below. Just as reference - I'm trying to figure out how likely it is for someone who spends a lot of money on alcohol, prescription drugs, sugar, outside food, etc. to also suffer from a medical condition. So in a column I have Yes/No responses on whether an individual suffers from a medical condition. Based off their zip code, I'm bringing in demographics on avg expenditures within that zip code for some variables I mentioned above. I'm then trying to calculate the correlation between them having a medical condition and those outside variables. I was using the Logistic Regression and Scoring tool. Not sure if that's the best way to go about it? Any tips would be greatly appreciated!


I think Logistic regression is a very appropriate tool for this job. Let us know if you have any trouble with that or any other R-based Alteryx tools. 
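For anyone curious what a logistic regression fit does under the hood, here is a small sketch in plain Python. It is an illustration under stated assumptions, not how the Alteryx tool is implemented: it fits the model with simple batch gradient descent, the names (`fit_logistic`, `predict_proba`), learning rate, and iteration count are hypothetical, and the data is made up. The features are assumed to be standardized already.

```python
from math import exp


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + exp(-z))
    ez = exp(z)
    return ez / (1.0 + ez)


def predict_proba(intercept, weights, x):
    """Predicted probability of a 'Yes' for one row of features."""
    return sigmoid(intercept + sum(w * xi for w, xi in zip(weights, x)))


def fit_logistic(rows, labels, lr=0.1, epochs=2000):
    """Fit an intercept and one weight per feature by batch gradient
    descent on the mean log-loss."""
    n = len(rows)
    n_features = len(rows[0])
    intercept = 0.0
    weights = [0.0] * n_features
    for _ in range(epochs):
        grad_b = 0.0
        grad_w = [0.0] * n_features
        for x, y in zip(rows, labels):
            err = predict_proba(intercept, weights, x) - y
            grad_b += err
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
        intercept -= lr * grad_b / n
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
    return intercept, weights


# Hypothetical standardized ZIP-level features: [alcohol spend, eating-out spend]
rows = [[1.2, 0.8], [0.9, 1.1], [1.5, 0.2], [-0.5, -0.3],
        [-1.1, -0.9], [0.3, -0.2], [-0.8, 0.4], [-0.2, -1.0]]
labels = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = has the condition, 0 = does not

intercept, weights = fit_logistic(rows, labels)
print("intercept:", intercept)
print("weights:  ", weights)
print("P(condition) for a new row:", predict_proba(intercept, weights, [1.0, 0.5]))
```

A positive weight means higher values of that variable go with higher predicted odds of the condition, holding the other variables fixed. Since the expenditure figures are ZIP-level averages, everyone in the same ZIP shares the same feature values.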
