Community Spring Cleaning week is here! Join your fellow Maveryx in digging through your old posts and marking comments on them as solved. Learn more here!

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Make Categorical Data Numeric

cpituley
8 - Asteroid

Hi Alteryx Community, 

 

I am trying to take a categorical data set and transform it into a purely numeric. Edit: I should add that this doesn't need to be ordinal since there is no relation between variables in a given column. 

 

I am thinking this involves the following steps: 

 

1) Remove duplicates from each column

2) Assign a unique ID (starting base 0) to each unique value in each column

3) Map the original data set back to the unique values

 

Example as follows: 

 

Before

Field 1 | Field 2 | Field 3

abc      | def       | ghi

abc      | def       |stu

jkl        | def       | ghi

jkl        | pqr       | stu

mno    | pqr       | ghi

mno    | pqr       | zzz

 

After:

Field 1 | Field 2 | Field 3

0          | 0          |0

0          | 0          |1

1          | 0          |0

1          | 1          |1

2          | 1          |0

2          | 1          | 2

 

Looking forward to your help. 

 

Thank you,

Cameron

3 REPLIES 3
CharlieS
17 - Castor
17 - Castor

I've attached a couple of solutions for scenarios like this. 

 

"UniqueCategorical.yxmd" will assign a numeric value (1+) to unique string characters in each field. This solution relies on the Tile tool configured to unique value. To start the assignment with 0, use a Formula tool after the tile tool to update the Tile_SequenceNum field to [Tile_SequenceNum]-1.

 

If you'll be using these values in statistic testing, I have another solution that converts string field values to binary flags for each field/value combination. This is the "DynamicDummies.yxmd" attachment. 

cpituley
8 - Asteroid

@CharlieS

 

This is brilliant. Your unique categorical is exactly what I was looking for. I was missing the tile tool set to unique. I was trying to use the unique tool and then find and replace. 

 

Cheers,

 

Cameron

irmb
7 - Meteor

Hello guys, I am very  new to Alteryx. I also have to convert some categorical values into numerical in order to do a linear regression. I am having difficulty applying the workflow provided. The issue is this:

I have two excel spreadsheets with thousands of data. One spreadsheet has names of people, state, address, country. The other table has names of people, products purchased, product category, sizes. I want to convert the product category, country, product category and size variables into numerical variables.  When I join both tables,  I don't know where the attach the workflow provided . Do I apply it to the join? Or do I apply the workflow to each separate table? The problem is that one of the tables and the combined table has one-to-many relationships. For example, one customer has bought products in different categories, etc. So the "product category" has fields where the categories repeat. How do I approach doing the conversion?

 

Thanks

Labels