SOLVED

Data QA

Ojay
8 - Asteroid

I'm trying to put together an Alteryx workflow that I can reuse as a standard for the QA of data in Excel files, rather than carrying out the QA in Excel before transferring the data to Alteryx.

 

Any help with this will be appreciated.

 

Thanks.

 

Olu

Thableaus
17 - Castor

Hi @Ojay 


There are a lot of tools under the "Data Investigation" tab in Alteryx that can help with QA of your data.


If you give us an example of how you usually do it in Excel, we can suggest how to do the same checks in Alteryx.


Cheers,

Ojay
8 - Asteroid

Thanks for your reply.

 

In Excel, I would go through every column individually to check for the following:

 

  • Date format (is correct)
  • Text within metrics (not present)
  • Leading & Trailing spaces
  • Duplication of entries within columns especially drop down lists
  • Data type (is correct)

Kind regards

Thableaus
17 - Castor

@Ojay 

 

Here are some examples of how to QA your datasets:

 

[Attached image: ExQA.PNG]

- With the RegEx tool and the RegEx functions in the Formula tool, you can identify date format patterns and correct them.

If you don't know what RegEx is, I recommend reading up on it. Here are some topics in the community and a website to help you:

https://community.alteryx.com/t5/Alteryx-Knowledge-Base/Tool-Mastery-RegEx/ta-p/37689

https://community.alteryx.com/t5/Alteryx-Knowledge-Base/RegEx-Examples-12-Handy-Use-Cases/ta-p/40680

https://www.rexegg.com/
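
To make the pattern idea concrete, here is a minimal sketch in Python. It is not the expression from the attached workflow (that expression is not shown in this thread). It assumes the required format is YYYY-MM-DD, and the names `ISO_DATE` and `matches_iso_date` are made up for illustration. A similar pattern could be adapted for the RegEx tool.

```python
import re

# Assumed target format: YYYY-MM-DD (e.g. 2019-03-28).
# The pattern checks the shape only: month 01-12 and day 01-31.
# It does not know that February has no 30th.
ISO_DATE = re.compile(r"\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])")

def matches_iso_date(value: str) -> bool:
    """Return True when the whole string has the assumed YYYY-MM-DD shape."""
    return ISO_DATE.fullmatch(value) is not None

samples = ["2019-03-28", "28/03/2019", "2019-13-01", " 2019-03-28"]
flagged = [s for s in samples if not matches_iso_date(s)]
print(flagged)
```

Note that the last sample is flagged only because of its leading space, so a pattern check like this also catches stray whitespace.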

 

- The Basic Data Profile tool gives you a lot of useful information about all of your fields,

including leading and trailing whitespace, longest length, number of nulls, etc.

Here is a useful topic on this tool:

https://community.alteryx.com/t5/Alteryx-Knowledge-Base/Tool-Mastery-Basic-Data-Profile/ta-p/28610
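
For readers who want to see what those metrics mean in practice, here is a rough Python sketch that computes a few of them for a single column. It is only an illustration of the metrics named above, not how the Basic Data Profile tool is implemented. The function name `profile_column` and the sample data are hypothetical.

```python
def profile_column(values):
    """Summarise one column given as a list of strings (None marks a null)."""
    non_null = [v for v in values if v is not None]
    return {
        "count": len(values),
        "nulls": len(values) - len(non_null),
        # A value has leading/trailing whitespace if stripping that side changes it.
        "leading_whitespace": sum(1 for v in non_null if v != v.lstrip()),
        "trailing_whitespace": sum(1 for v in non_null if v != v.rstrip()),
        "longest_length": max((len(v) for v in non_null), default=0),
    }

column = ["London", " Paris", "Rome ", None, "Berlin"]
print(profile_column(column))
```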

 

- The Field Info tool gives you the metadata: field types, field sizes, and field names. Here is a topic about it:

https://community.alteryx.com/t5/Alteryx-Knowledge-Base/Tool-Mastery-Field-Info/ta-p/60723
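
If the metadata shows a metric column coming through as a string type, that can be a hint that some entries contain text. The sketch below, in Python, shows one way to locate such entries. It is an assumption about how the "text within metrics" check could be done, not something the Field Info tool does itself, and `non_numeric_entries` is a hypothetical helper.

```python
def non_numeric_entries(values):
    """Return (row_index, value) pairs that cannot be read as a number."""
    problems = []
    for index, value in enumerate(values):
        try:
            # float() also accepts "nan" and "inf"; add an extra check
            # if those should count as invalid in your data.
            float(value)
        except (TypeError, ValueError):
            problems.append((index, value))
    return problems

metric = ["12.5", "7", "n/a", "", "3e2"]
print(non_numeric_entries(metric))
```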

 

- The Frequency Table tool shows how often each value occurs in each field, which makes it a good way to find duplicates.
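
The same idea, counting how often each value occurs and keeping the ones seen more than once, can be sketched in Python as follows. Ignoring case and outer spaces before counting is my own assumption (useful for drop-down lists where "Open" and "open " should be the same entry); drop that step if exact matches are wanted. The name `find_duplicates` is hypothetical.

```python
from collections import Counter

def find_duplicates(values):
    """Return values that occur more than once, ignoring case and outer spaces."""
    counts = Counter(v.strip().lower() for v in values)
    return {value: n for value, n in counts.items() if n > 1}

dropdown = ["Open", "Closed", "open ", "Pending", "Closed"]
print(find_duplicates(dropdown))
```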

 

- Field Summary is similar to Basic Data Profile, but it focuses on the fields you select.

 

These are some of the tools for data QA. The Browse tool can also do most of what these tools do, so you can use it for ad-hoc checks on the quality of your data.

 

I'm attaching the package with the dataset analyzed and all the tools annotated.

 

I hope this clears things up and boosts your interest in Alteryx.


Cheers,

 

Ojay
8 - Asteroid

@Thableaus

Truly appreciated.

Will look into the links you sent as well as trying out the workflow.

Best regards

Thableaus
17 - Castor

Hi @Ojay 

 

What did you think of the tools I showed you? Did they work for you?

 

Cheers,

Ojay
8 - Asteroid

Hi @Thableaus 

 

I'm actually trying it now.

 

How do I resolve instances of dates not matching the required format?

 

Your help is truly appreciated.

 

Regards

Thableaus
17 - Castor

@Ojay 

 

Check the RegEx tool (the green one in my workflow).

It's useful for checking formats, patterns, etc. That's how you can identify whether a string date is in the expected format.
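
Once the mismatches are identified, one possible way to resolve them is to try a short list of known input formats and rewrite each date in the required one. The Python sketch below illustrates that approach. The list of candidate formats and the YYYY-MM-DD target are assumptions, and `normalise_date` is a hypothetical helper, so adjust both to the formats that actually appear in the data.

```python
from datetime import datetime

# Assumed input formats, tried in order.
CANDIDATE_FORMATS = ["%Y-%m-%d", "%d/%m/%Y", "%d.%m.%Y"]

def normalise_date(text):
    """Return the date as YYYY-MM-DD, or None if no candidate format fits."""
    cleaned = text.strip()
    for fmt in CANDIDATE_FORMATS:
        try:
            return datetime.strptime(cleaned, fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue
    return None

for raw in ["2019-03-28", "28/03/2019", "28.03.2019", "March 28"]:
    print(raw, "->", normalise_date(raw))
```

Values that come back as None match none of the listed formats and still need a manual look.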

 

Cheers,

Ojay
8 - Asteroid

@Thableaus: Could you clarify which expression (i.e. which date format) the RegEx tool is checking in this instance?

Thableaus
17 - Castor

@Ojay 

 

Yes, the RegEx tool is checking whether the date is in the right format.

It takes a bit of understanding of how RegEx works, but working with dates is fairly intuitive.

 

Cheers,
