Advent of Code is back! Unwrap daily challenges to sharpen your Alteryx skills and earn badges along the way! Learn more now.

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

How to bin data?

mkhtran
9 - Comet

I have some data in the form of

 

Field1 (string)Field2 (int)
foo12
bar78
baz62

 

I'd like to add another Field that depends on Field2, which would work similar to a Formula tool with

 

Output Column: Category (string)

 

IF     [Field2] <= 25  THEN "TINY"
ELSEIF [Field2] <= 50  THEN "SMALL"
ELSEIF [Field2] <= 75  THEN "MEDIUM"
ELSEIF [Field2] <= 100 THEN "BIG"
ELSE                        "HUGE"
ENDIF

 

 

The problem is that my categories and cutoffs are coming as data from elsewhere in the workflow, like

 

Category LevelCategory NameCategory Cutoff
1TINY25
2SMALL50
3MEDIUM75
4BIG100
5HUGE999

 

What's the best way to achieve this?

5 REPLIES 5
Prometheus
12 - Quasar

@mkhtran The first thing I did was create a field that showed the difference between a category cutoff and the next category cutoff. Next, I used a Generate Rows tool to create a row for every Category Cutoff until it created as many rows as was the difference. After that, I used a RecordID tool and then joined the original dataset to this data on Field2 (int) to RowID.

Difference.PNG

Gen up to Diff.PNG

Join Configuration.PNG

Output.PNG

 

Qiu
21 - Polaris
21 - Polaris

@mkhtran 
I would use the table of your cutoff to generate the range of cutoff for Hi and lo then use appen to generate all the combinations.

Then filter tool should filter out the records matching your criteria.

0929-mkhtran.PNG0929-mkhtran-A.PNG

OllieClarke
15 - Aurora
15 - Aurora

@mkhtran 

Here's another approach using the Dynamic Replace tool, which doesn't require cross-joining your data

 

image.png

 

I use the multi-row to generate a boolean expressionimage.png

 

Having already created a blank Category field in the original data stream, and written its name as data in the second stream, I can replace that value using the dynamic replace tool:

 

image.png

@Qiu's approach is easier, and requires fewer tools, but if you have a big dataset and lots of category values, you will create a lot of unnecessary data.

Ollie

Qiu
21 - Polaris
21 - Polaris

@OllieClarke 
That is brilliant!

I am always not good at Dynamic replace and you have provide a very good sample.

mkhtran
9 - Comet

I ended up with a solution similar to @Qiu's which worked ok for my dataset, but the Dynamic Replace is new to me and I definitely want to check it out now.

Labels