Hello Alteryx Community,
I’ve built a workflow that processes datasets to filter out invalid data based on specific requirements. While the current workflow works fine for smaller datasets, I’m looking for ways to simplify and modularize it to handle a larger and more varied set of inputs.
Current Workflow:
Inputs: The workflow takes datasets and requirements as inputs. For example:
Id | colour | weight | sweetness |
apple1 | red | 5 | 1 |
apple2 | yellow | 4 | 10 |
apple3 | green | 6 | 9 |
apple4 | blue | 5 | 6 |
apple6 | | 20 | 10 |
Id | Length | Width | Height | Weight |
box1 | 10 | 20 | 30 | 50 |
box2 | 15 | 30 | 20 | 40 |
box3 | 20 | 15 | 15 | 10 |
box4 | 5 | 5 | 5 | 100 |
FileName | Field | Type | Min | Max | Classes | emptyOk |
appleRecord | colour | Class | | | red,yellow,green | FALSE |
appleRecord | weight | Range | 1 | 10 | | FALSE |
appleRecord | sweetness | Range | 5 | | | FALSE |
boxRecord | Length | Range | 15 | 30 | | FALSE |
boxRecord | Width | Range | 15 | 30 | | FALSE |
boxRecord | Height | Range | 15 | 30 | | FALSE |
boxRecord | Weight | Range | 15 | 30 | | FALSE |
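To make the requirements table easier to reason about outside Alteryx, here is a minimal Python sketch of how one row could be read into a typed rule. It assumes the columns are FileName, Field, Type, Min, Max, Classes and emptyOk as shown above, that a blank Min or Max means "no bound on that side", and that class names are compared case-insensitively. The `Rule` class and `parse_rule` function are hypothetical names used only for illustration, not part of the attached workflow.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    """One row of the requirements table (hypothetical structure)."""
    file_name: str
    field: str
    kind: str                 # "Range" or "Class"
    minimum: Optional[float]  # None means no lower bound (assumption)
    maximum: Optional[float]  # None means no upper bound (assumption)
    classes: frozenset        # allowed values, upper-cased
    empty_ok: bool


def _to_number(text: str) -> Optional[float]:
    """Convert a cell to float, treating a blank cell as 'not set'."""
    text = text.strip()
    return float(text) if text else None


def parse_rule(row: dict) -> Rule:
    """Build a Rule from a dict keyed by the requirements column names."""
    classes = frozenset(
        c.strip().upper() for c in row.get("Classes", "").split(",") if c.strip()
    )
    return Rule(
        file_name=row["FileName"].strip(),
        field=row["Field"].strip(),
        kind=row["Type"].strip(),
        minimum=_to_number(row.get("Min", "")),
        maximum=_to_number(row.get("Max", "")),
        classes=classes,
        empty_ok=row.get("emptyOk", "").strip().upper() == "TRUE",
    )


# Two rows copied from the requirements table above.
rows = [
    {"FileName": "appleRecord", "Field": "colour", "Type": "Class",
     "Min": "", "Max": "", "Classes": "red,yellow,green", "emptyOk": "FALSE"},
    {"FileName": "appleRecord", "Field": "sweetness", "Type": "Range",
     "Min": "5", "Max": "", "Classes": "", "emptyOk": "FALSE"},
]
rules = [parse_rule(r) for r in rows]
```

Keeping every rule in the same shape is what would let a single generic check handle any dataset, instead of one branch of tools per field.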
Field Out of Range: SWEETNESS
RecordID | Id | colour | weight | sweetness |
1 | APPLE1 | RED | 5 | 1 |
Field Out of Range: WEIGHT
RecordID | Id | colour | weight | sweetness |
5 | APPLE6 | | 20 | 10 |
Invalid Type: COLOUR
RecordID | Id | colour | weight | sweetness |
4 | APPLE4 | BLUE | 5 | 6 |
5 | APPLE6 | | 20 | 10 |
Field Out of Range: HEIGHT
RecordID | Id | Weight | Length | Width | Height |
4 | BOX4 | 100 | 5 | 5 | 5 |
Field Out of Range: LENGTH
RecordID | Id | Weight | Length | Width | Height |
4 | BOX4 | 100 | 5 | 5 | 5 |
1 | BOX1 | 50 | 10 | 20 | 30 |
Field Out of Range: WEIGHT
RecordID | Id | Weight | Length | Width | Height |
4 | BOX4 | 100 | 5 | 5 | 5 |
2 | BOX2 | 40 | 15 | 30 | 20 |
1 | BOX1 | 50 | 10 | 20 | 30 |
3 | BOX3 | 10 | 20 | 15 | 15 |
Field Out of Range: WIDTH
RecordID | Id | Weight | Length | Width | Height |
4 | BOX4 | 100 | 5 | 5 | 5 |
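For anyone who wants to check the expected output independently of the workflow, the logic behind the reports above can be sketched in Python roughly as follows. This is only an illustration under some assumptions: RecordID is the 1-based row position, class values are compared case-insensitively, and an empty value with emptyOk = FALSE is reported under the same heading as any other failure for that field (this matches the APPLE6 row under "Invalid Type: COLOUR", but for Range fields it is a guess). The names `failure_label` and `validate` are hypothetical.

```python
from collections import defaultdict


def failure_label(value, rule):
    """Return the report heading if `value` breaks `rule`, otherwise None."""
    text = "" if value is None else str(value).strip()
    field = rule["field"].upper()
    if rule["type"] == "Class":
        label = f"Invalid Type: {field}"
        if text == "":
            return None if rule["empty_ok"] else label
        allowed = {c.upper() for c in rule["classes"]}
        return None if text.upper() in allowed else label
    # Anything else is treated as a Range rule (assumption).
    label = f"Field Out of Range: {field}"
    if text == "":
        return None if rule["empty_ok"] else label
    number = float(text)  # assumes Range fields hold numeric text
    if rule["min"] is not None and number < rule["min"]:
        return label
    if rule["max"] is not None and number > rule["max"]:
        return label
    return None


def validate(records, rules):
    """Map each report heading to the RecordIDs (1-based) that fail it."""
    report = defaultdict(list)
    for record_id, record in enumerate(records, start=1):
        for rule in rules:
            label = failure_label(record.get(rule["field"]), rule)
            if label is not None:
                report[label].append(record_id)
    return dict(report)


# Rules and data copied from the appleRecord example above.
apple_rules = [
    {"field": "colour", "type": "Class", "min": None, "max": None,
     "classes": ["red", "yellow", "green"], "empty_ok": False},
    {"field": "weight", "type": "Range", "min": 1, "max": 10,
     "classes": [], "empty_ok": False},
    {"field": "sweetness", "type": "Range", "min": 5, "max": None,
     "classes": [], "empty_ok": False},
]
apples = [
    {"Id": "apple1", "colour": "red", "weight": 5, "sweetness": 1},
    {"Id": "apple2", "colour": "yellow", "weight": 4, "sweetness": 10},
    {"Id": "apple3", "colour": "green", "weight": 6, "sweetness": 9},
    {"Id": "apple4", "colour": "blue", "weight": 5, "sweetness": 6},
    {"Id": "apple6", "colour": "", "weight": 20, "sweetness": 10},
]
report = validate(apples, apple_rules)
```

Because the loop is driven entirely by the rule list, the same two functions would apply to boxRecord or any new file without changes, which is the kind of modularity I am hoping to reproduce in the workflow.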
While the workflow functions as intended for small datasets, I’m aiming to make it more flexible and reusable by addressing the following points:
I’ve attached the original workflow for reference. Any suggestions on how to achieve these improvements would be greatly appreciated!
Thank you!