Hi Alteryx Community,
I am trying to take a categorical data set and transform it into a purely numeric one. Edit: I should add that the encoding doesn't need to be ordinal, since there is no relationship between the values in a given column.
I am thinking this involves the following steps:
1) Remove duplicates from each column
2) Assign a unique ID (starting at 0) to each unique value in each column
3) Map the IDs back onto the original data set, replacing each value with its ID
Example as follows:
Before:
Field 1 | Field 2 | Field 3
abc | def | ghi
abc | def | stu
jkl | def | ghi
jkl | pqr | stu
mno | pqr | ghi
mno | pqr | zzz
After:
Field 1 | Field 2 | Field 3
0 | 0 | 0
0 | 0 | 1
1 | 0 | 0
1 | 1 | 1
2 | 1 | 0
2 | 1 | 2
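To make the intended logic concrete, here is a rough sketch of the three steps in plain Python, outside of any Alteryx tool. It is only an illustration under my own assumptions: the function name `encode_columns` and the list-of-rows layout are made up for this example, and IDs are handed out in order of first appearance within each column.

```python
def encode_columns(rows):
    """Replace each value with a 0-based ID that is unique within its column.

    Hypothetical helper for illustration; IDs follow order of first appearance.
    Returns the encoded rows and one value-to-ID lookup dict per column.
    """
    n_cols = len(rows[0]) if rows else 0
    mappings = [{} for _ in range(n_cols)]  # one lookup table per column
    encoded = []
    for row in rows:
        encoded_row = []
        for col, value in enumerate(row):
            mapping = mappings[col]
            # First time a value is seen it gets the next unused ID;
            # afterwards setdefault returns the ID already stored.
            encoded_row.append(mapping.setdefault(value, len(mapping)))
        encoded.append(encoded_row)
    return encoded, mappings


# The "Before" table from this post, one list per row.
rows = [
    ["abc", "def", "ghi"],
    ["abc", "def", "stu"],
    ["jkl", "def", "ghi"],
    ["jkl", "pqr", "stu"],
    ["mno", "pqr", "ghi"],
    ["mno", "pqr", "zzz"],
]

encoded, mappings = encode_columns(rows)
for encoded_row in encoded:
    print(" | ".join(str(value_id) for value_id in encoded_row))
```

The per-column dictionaries in `mappings` cover steps 1 and 2 (the unique values and their IDs), and the encoded rows are step 3.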
Looking forward to your help.
Thank you,
Cameron