
Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Dynamic profiling tool

Eivind
6 - Meteoroid

Just arrived in the Alteryx community - happy to be here!

 

I am attempting to build a data profiling tool for a SQL database that can read a variable number of tables, each with a different number of columns. The workflow should then run a set of operations on each column (count blanks, count distinct, etc.) and output the results for each column to an Excel spreadsheet or similar.
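
To make that concrete, the kind of per-column summary I am after looks roughly like this (just a sketch - [SomeTable] and [SomeColumn] are placeholders for the user-defined names, and the blank test assumes a SQL Server source):

-- Profile one column: total rows, distinct values, and blank/NULL values.
-- [SomeTable] and [SomeColumn] are hypothetical placeholders.
SELECT COUNT(*) TotalRows,
       COUNT(DISTINCT [SomeColumn]) DistinctValues,
       SUM(CASE WHEN [SomeColumn] IS NULL
                  OR LTRIM(RTRIM(CONVERT(NVARCHAR(2048), [SomeColumn]))) = ''
                THEN 1 ELSE 0 END) BlankValues
FROM [SomeTable]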

 

I have done some testing with the Input and Dynamic Input tools and arrived at the workflow below. As you can see, an Input tool retrieves the user-defined field and table names in the database and feeds them into a Dynamic Input tool that loops through all fields and table names. However, when I try to connect this to anything else, I run into several error messages, and I cannot find a way to dynamically feed the output of the Dynamic Input tool into a Formula tool.

 

Is there anything I have overlooked in how to do this, or are there any alternatives that would be a more sensible way to build my dynamic profiling tool?

 

[Attachment: Capture.PNG]

4 REPLIES
jdunkerley79
ACE Emeritus

Interesting idea - will play with it at lunch.

 

Think you might need to be careful around different field types. You might need a couple of different paths, one for each type.
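
Something along these lines could split the column list into per-type paths before profiling (a sketch only - which DATA_TYPE values to filter on depends on what the source tables actually contain):

-- Text path: character columns only; a second query (or a Filter tool
-- in the workflow) would pick up the numeric and date types.
SELECT TABLE_NAME, COLUMN_NAME, DATA_TYPE
FROM INFORMATION_SCHEMA.COLUMNS
WHERE DATA_TYPE IN ('char', 'nchar', 'varchar', 'nvarchar')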


jdunkerley79
ACE Emeritus

Attached my lunchtime playing.

 

You need to cast everything to a fixed type, then you are OK.

 

I cast both TableName and ColumnName in SQL to an nvarchar(256).

I cast the value to an nvarchar(2048). One optimisation would be to read the max length needed from the input columns and set that.
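
A sketch of that optimisation, assuming the lengths are read from INFORMATION_SCHEMA (CHARACTER_MAXIMUM_LENGTH is NULL for non-character types and -1 for MAX types, so both need a fallback):

-- Derive a per-column cast length instead of a fixed 2048.
SELECT TABLE_NAME,
       COLUMN_NAME,
       CASE WHEN CHARACTER_MAXIMUM_LENGTH BETWEEN 1 AND 2048
            THEN CHARACTER_MAXIMUM_LENGTH
            ELSE 2048 END CastLength
FROM INFORMATION_SCHEMA.COLUMNS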

 

The template query I used was:

 

SELECT CONVERT(NVARCHAR(256), '[GLOBAL_Matter]') TableName,
       CONVERT(NVARCHAR(256), '[CaseCategory_ID]') ColumnName,
       CONVERT(NVARCHAR(2048), [GLOBAL_Matter].[CaseCategory_ID], 120) Value
FROM [GLOBAL_Matter]
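
(A side note on the 120 in that CONVERT: it is the ODBC canonical date/time style, so any datetime columns come through as yyyy-mm-dd hh:mi:ss; for columns that are already strings the style argument is simply ignored.)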

I adjusted the SQL table column query to:

SELECT '[' + TABLE_SCHEMA + '].[' + TABLE_NAME + ']' TableName,
       '[' + COLUMN_NAME + ']' ColumnName,
       DATA_TYPE,
       CHARACTER_MAXIMUM_LENGTH,
       NUMERIC_PRECISION
FROM INFORMATION_SCHEMA.COLUMNS
 

By surrounding the names with [], I think it is safer for this one.
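
A hypothetical example of why the brackets matter - an identifier containing a space or a reserved word only parses when it is quoted:

-- [Order Details] and [Unit Price] are made-up names; unbracketed,
-- the generated query would fail to parse.
SELECT [Order Details].[Unit Price]
FROM [Order Details]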

 

Hope this helps!

Eivind
6 - Meteoroid

Thanks for your advice! This seems like a good step closer to a solution. However, I keep getting an error message in relation to [Global_Matter]. Any thoughts?


[Attachment: Capture.PNG]

jdunkerley79
ACE Emeritus

That was just the sample table I used.

 

Have amended to:

SELECT CONVERT(NVARCHAR(256), 'INFORMATION_SCHEMA.COLUMNS') TableName,
       CONVERT(NVARCHAR(256), 'COLUMN_NAME') ColumnName,
       CONVERT(NVARCHAR(256), COLUMN_NAME, 120) Value
FROM INFORMATION_SCHEMA.COLUMNS

You will need to change the connection strings in both the Input and Dynamic Input tools (as they point to my laptop's local SQL Server).

 

Otherwise it should work out of the box now, I hope.
