Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.

Data Quality Check

SahilArora3139
8 - Asteroid

Hi

i need  help to check the data quality of my input file: (excel /csv) (also workflow needs t be stopped if it doesn't meet any of the parameter mentioned in the sheet)

 

I have created the sample workflow for the same.

But need to build a macro for the same.

please help me for the same.

also i have attached an excel to show my sample file columns parameters.

 

1) Based on my workflow ,please help me out to build a macro for this

2) based on the excel sheet how can i proceed with workflow and macro

 

kindly suggest any tool that can be used to filter out these quality parameters in the data set. (Csv/excel/txt).

 

1 REPLY 1
danilang
19 - Altair
19 - Altair

Hi @SahilArora3139 

 

The technique I use to handle invalid data is this

danilang_0-1680612920917.png

 

The top branch is where the testing logic goes.  The testing algorithm is always configured so that the result [ErrorFlag] is a single row with 0=No Errors and any other value = Error.  In this simple example I just count the number of empty values and use this as [ErrorFlag] as well as using a Message tool(Warning) to output the result if ErrorFlag!=0.  Real life scenarios involve multiple tests, but if you configure each of them to return 0 if no error and and a positive value if there is an error, you just have to sum the all the individuals values to give you a final [ErrorFlag]    The bottom branch adds [ErrorFlag] to the input data.  This is joined to the top branch on [ErrorFlag].  The result of this join is that either all the records pass if the input is error free or no records pass.  Remove [ErrorFlag] in the Select tool.

 

In general, users want the the workflow to write an output if the input is valid and do nothing if the input is not.   You can do this with a message set to Error and Stop Passing Record, but this will stop the workflow when the first Error is found.  If you want the workflow to evaluate and report on all the validation errors in one run, you need to use a different strategy.  The key to this strategy is in the configuration of the Output Data tool.  If the tool is configured to use a static file name, the file will always be overwritten even if the input has no rows.  If the Output Data tool is setup to Take File/Table name From Field, and there are no rows coming into the tool, the target file won't be changed since there is no file information in the input.  This coupled with the validation logic passing either all the rows or none of them, ensures that the output file only gets modified if the input data is valid.  

 

Here are the results of two runs, one with valid data and the other with a missing input value

danilang_1-1680612303117.pngdanilang_2-1680612336926.png

Notice that in the second run the workflow completes, but no rows get written to the output file.  

 

To convert the validate logic to a standard macro, add a macro input and a macro output to return the results 

danilang_3-1680612521746.png

Insert the macro into your main workflow after the Input Data tool.  

danilang_4-1680612612368.png

 

Dan

 

 

 

 

 

Labels