

One-hot encoding for semicolon-separated values within a column

alpro-23

 

Hi all, 

 

I'm new to Alteryx and I'm trying to learn how to one-hot encode a list of values stored in a single column.

 

For example, the column could be 'allergies', and each row would hold a different set of values, separated by semicolons.

 

e.g.

row 1: peanut; shellfish

row 2: shellfish; soya bean

row 3: peanut

row 4: shellfish; soya bean; peanut

 

As the end result, I hope to have one column for each unique value in the 'allergies' column (i.e. one-hot encoded).

 

The equivalent in Python (pandas) is this:

dummy_df = df['col_of_interest'].str.get_dummies(sep=";")
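
For reference, here is a minimal sketch of that pandas call applied to the four sample rows above. The column name `allergies` and the whitespace clean-up step are assumptions added for illustration: the sample values are separated by a semicolon plus a space, so splitting on ";" alone would leave leading spaces and could produce separate columns for "peanut" and " peanut".

```python
import pandas as pd

# Sample rows copied from the example above; the column name is illustrative.
df = pd.DataFrame({
    "allergies": [
        "peanut; shellfish",
        "shellfish; soya bean",
        "peanut",
        "shellfish; soya bean; peanut",
    ]
})

# Assumed clean-up step: remove whitespace around each semicolon so that
# " peanut" and "peanut" end up in the same column.
cleaned = df["allergies"].str.replace(r"\s*;\s*", ";", regex=True).str.strip()

# One 0/1 column per unique value found in the semicolon-separated lists.
dummy_df = cleaned.str.get_dummies(sep=";")
print(dummy_df)
```

This should give one 0/1 indicator column each for peanut, shellfish and soya bean, which is the shape of output I'm hoping to reproduce in Alteryx.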

I would appreciate it if someone could help. Thank you.

MarqueeCrew

@alpro-23,

 

https://gallery.alteryx.com/#!app/CReW-Generate-Dummy-Variables/5fca958e0462d71998cd0aac 

 

I created a macro that performs this function, taking one variable at a time.

cheers,

 

 mark
