Alteryx Designer Desktop Discussions

hcho

Hello Alteryx Community,

I’ve built a workflow that processes datasets to filter out invalid data based on specific requirements. While the current workflow works fine for smaller datasets, I’m looking for ways to simplify and modularize it to handle a larger and more varied set of inputs.

Current Workflow:

Inputs: The workflow takes datasets and requirements as inputs. For example:

Dataset 1: appleRecord

Id	colour	weight	sweetness
apple1	red	5	1
apple2	yellow	4	10
apple3	green	6	9
apple4	blue	5	6
apple6		20	10

Dataset 2: boxRecord

Id	Length	Width	Height	Weight
box1	10	20	30	50
box2	15	30	20	40
box3	20	15	15	10
box4	5	5	5	100

Requirements (Input): Each dataset has specific validation criteria for both class and range fields, as defined in a requirements table:

FileName	Field	Type	Min	Max	Classes	emptyOk
appleRecord	colour	Class			red,yellow,green	FALSE
appleRecord	weight	Range	1	10		FALSE
appleRecord	sweetness	Range	5			FALSE
boxRecord	Length	Range	15	30		FALSE
boxRecord	Width	Range	15	30		FALSE
boxRecord	Height	Range	15	30		FALSE
boxRecord	Weight	Range	15	30		FALSE
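
To make the rule semantics concrete, here is a minimal Python sketch of the checks the requirements table implies. It is not the Alteryx workflow itself, only an illustration of the logic. The function names (parse_rule, is_invalid, validate) are hypothetical, and it assumes that a blank Min or Max means "no bound on that side", that class matching is exact, and that RecordID is the 1-based row position.

```python
# Requirement rows as read from the requirements table.
# All values are strings; an empty string means "not set".
RULE_ROWS = [
    {"FileName": "appleRecord", "Field": "colour", "Type": "Class",
     "Min": "", "Max": "", "Classes": "red,yellow,green", "emptyOk": "FALSE"},
    {"FileName": "appleRecord", "Field": "weight", "Type": "Range",
     "Min": "1", "Max": "10", "Classes": "", "emptyOk": "FALSE"},
    {"FileName": "appleRecord", "Field": "sweetness", "Type": "Range",
     "Min": "5", "Max": "", "Classes": "", "emptyOk": "FALSE"},
]

APPLE_ROWS = [
    {"Id": "apple1", "colour": "red", "weight": "5", "sweetness": "1"},
    {"Id": "apple2", "colour": "yellow", "weight": "4", "sweetness": "10"},
    {"Id": "apple3", "colour": "green", "weight": "6", "sweetness": "9"},
    {"Id": "apple4", "colour": "blue", "weight": "5", "sweetness": "6"},
    {"Id": "apple6", "colour": "", "weight": "20", "sweetness": "10"},
]


def parse_rule(row):
    """Convert one requirements row into a typed rule (hypothetical helper)."""
    return {
        "field": row["Field"],
        "type": row["Type"],
        # Assumption: a blank bound means the range is open on that side.
        "min": float(row["Min"]) if row["Min"] else None,
        "max": float(row["Max"]) if row["Max"] else None,
        "classes": {c.strip() for c in row["Classes"].split(",") if c.strip()},
        "empty_ok": row["emptyOk"].strip().upper() == "TRUE",
    }


def is_invalid(value, rule):
    """Return True if a single cell value violates the rule."""
    value = value.strip()
    if value == "":
        return not rule["empty_ok"]
    if rule["type"] == "Class":
        return value not in rule["classes"]
    # Otherwise treat the rule as a Range check.
    try:
        number = float(value)
    except ValueError:
        return True  # non-numeric value in a numeric field
    if rule["min"] is not None and number < rule["min"]:
        return True
    if rule["max"] is not None and number > rule["max"]:
        return True
    return False


def validate(records, rules):
    """Map each failing field to the 1-based RecordIDs that violate it."""
    failures = {}
    for rule in rules:
        bad_ids = [
            record_id
            for record_id, record in enumerate(records, start=1)
            if is_invalid(record.get(rule["field"], ""), rule)
        ]
        if bad_ids:
            failures[rule["field"]] = bad_ids
    return failures


rules = [parse_rule(row) for row in RULE_ROWS]
failures = validate(APPLE_ROWS, rules)
print(failures)
```

The point of the sketch is that one generic check, driven entirely by the rule rows, covers every field. Adding a field or dataset then means adding rows to the requirements table rather than adding tools to the workflow.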

Outputs: The workflow generates an Excel file with two sheets, one for each input dataset, listing the invalid records. The original data format is retained, with an added RecordID column. Here's how the output looks:

Sheet 1: appleRecord

Field Out of Range: SWEETNESS
RecordID	Id	colour	weight	sweetness
1	APPLE1	RED	5	1
Field Out of Range: WEIGHT
RecordID	Id	colour	weight	sweetness
5	APPLE6		20	10
Invalid Type: COLOUR
RecordID	Id	colour	weight	sweetness
4	APPLE4	BLUE	5	6
5	APPLE6		20	10

Sheet 2: boxRecord

Field Out of Range: HEIGHT
RecordID	Id	Weight	Length	Width	Height
4	BOX4	100	5	5	5
Field Out of Range: LENGTH
RecordID	Id	Weight	Length	Width	Height
4	BOX4	100	5	5	5
1	BOX1	50	10	20	30
Field Out of Range: WEIGHT
RecordID	Id	Weight	Length	Width	Height
4	BOX4	100	5	5	5
2	BOX2	40	15	30	20
1	BOX1	50	10	20	30
3	BOX3	10	20	15	15
Field Out of Range: WIDTH
RecordID	Id	Weight	Length	Width	Height
4	BOX4	100	5	5	5

While the workflow functions as intended for small datasets, I’m aiming to make it more flexible and reusable by addressing the following points:

Dynamic File Handling: Instead of manually connecting input files, I want the workflow to dynamically pull data based on the requirements table and file paths. This would allow the workflow to scale with varying numbers and sizes of datasets.
Simplification: The current workflow has repetitive actions. I’m seeking advice on how to streamline this process.
Scalability: The example datasets are simplified for demonstration, but I need the workflow to be applicable to more complex and larger datasets.
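
As an illustration of the dynamic file handling I have in mind, here is a rough Python sketch that derives the list of files to read from the requirements table instead of hard-wiring each input. The helper names and the layout are assumptions made for the example only: each dataset is taken to be a CSV file named after its FileName value and stored in a single folder. In Designer, the closest equivalents are probably the Dynamic Input tool or a batch macro.

```python
import csv
from collections import defaultdict
from pathlib import Path


def group_rules_by_file(rule_rows):
    """Group requirement rows by their FileName value, preserving order."""
    grouped = defaultdict(list)
    for row in rule_rows:
        grouped[row["FileName"]].append(row)
    return dict(grouped)


def load_datasets(rule_rows, data_dir, suffix=".csv"):
    """Load every dataset named in the requirements table.

    Assumption: each dataset lives at <data_dir>/<FileName><suffix>.
    Returns (datasets, missing), where datasets maps FileName to a list
    of row dicts and missing lists the FileName values with no file.
    """
    datasets = {}
    missing = []
    for name in group_rules_by_file(rule_rows):
        path = Path(data_dir) / f"{name}{suffix}"
        if not path.exists():
            missing.append(name)
            continue
        with path.open(newline="", encoding="utf-8") as handle:
            datasets[name] = list(csv.DictReader(handle))
    return datasets, missing
```

With this shape, the number of inputs is decided by the requirements table at run time, and files that are listed but not found are reported instead of silently skipped.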

I’ve attached the original workflow for reference. Any suggestions on how to achieve these improvements would be greatly appreciated!

Thank you!



How to Simplify and Modularize a Workflow for Dynamic Data Processing