When Alteryx Connect is first installed at a company with a small Alteryx Designer user base, you do not get much benefit from lineage.
There are not many workflows at hand. So, in order to realize Alteryx Connect's immediate benefits, I'd like to suggest
a company-wide Data Quality Score.
Let's score each data element in our distributed data stores
and automatically assign it a simple score between one and five:
1 means “we don’t know”.
2 means the data was entered or last updated more than 1 year ago, or has conflicting data.
3 would be the norm and means the customer provided this data, as accurate and as up to date as they entered it, and ‘agreed’ to share it with you.
4 means “we cross-checked the data with 3rd-party sources, or the addresses work in Google Maps”.
5 means “we had the customer or their representative validate the address in the last 3 months”.
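As a rough illustration, the five levels above could be turned into a small rule function. This is only a sketch under stated assumptions: the function name `rubric_score`, its parameters, the 90-day approximation of "3 months", and the order in which the rules are checked are all my own choices for the example, not something Alteryx Connect provides.

```python
from datetime import date, timedelta
from typing import Optional


def rubric_score(last_updated: Optional[date],
                 today: date,
                 has_conflicts: bool = False,
                 cross_checked: bool = False,
                 validated_on: Optional[date] = None) -> int:
    """Map evidence about one data element to the 1-5 scale (hypothetical helper)."""
    if last_updated is None:
        return 1  # "we don't know"
    # Assumption: "last 3 months" is approximated as 90 days.
    if validated_on is not None and (today - validated_on) <= timedelta(days=90):
        return 5  # recently validated by the customer or representative
    if cross_checked:
        return 4  # confirmed against a 3rd-party source
    if has_conflicts or (today - last_updated) > timedelta(days=365):
        return 2  # stale or conflicting
    return 3  # the norm: customer-provided, reasonably current


print(rubric_score(date(2023, 10, 1), date(2024, 1, 1)))
```

Checking the strongest evidence first (validation, then cross-checking) is a design choice for this sketch; a real rollout might prefer to let conflicts override a cross-check.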
The score will be based on:
Information value (whether variance is high or not; if there is no variance, the column carries no useful information)
How many times that column is referenced in other tables
Format (structured like a telephone number ###-##-## or semi-structured like an address)
Whether it is an ID column
Whether it is a datetime column, and whether there are any discrepancies in datetime columns, etc.
Time since the data was last updated
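Several of the criteria above can be measured directly from a column's values. The sketch below is one possible way to do it, assuming the values arrive as a plain list of strings; `profile_column`, the metric names, and the use of the distinct-value ratio as a stand-in for "variance" are assumptions made for illustration. Counting references from other tables is left out because it needs catalog metadata rather than the column values themselves.

```python
import re
from datetime import date
from typing import Optional, Sequence

# The ###-##-## telephone shape mentioned in the criteria above.
PHONE_PATTERN = re.compile(r"^\d{3}-\d{2}-\d{2}$")


def profile_column(values: Sequence[Optional[str]],
                   last_updated: date,
                   today: date,
                   pattern: re.Pattern = PHONE_PATTERN) -> dict:
    """Compute simple per-column metrics (hypothetical helper)."""
    non_null = [v for v in values if v not in (None, "")]
    distinct = set(non_null)
    n = len(non_null)
    return {
        # Share of distinct values: a crude proxy for information value.
        "distinct_ratio": len(distinct) / n if n else 0.0,
        # A column with a single value carries no information.
        "is_constant": len(distinct) <= 1,
        # Every non-null value unique: the column may be an ID.
        "looks_like_id": n > 0 and len(distinct) == n,
        # Share of values that match the expected structured format.
        "format_match_rate": sum(1 for v in non_null if pattern.match(v)) / n if n else 0.0,
        # Time since the data was last updated, in days.
        "days_since_update": (today - last_updated).days,
    }


profile = profile_column(["555-12-34", "555-12-34", "bad", None],
                         last_updated=date(2024, 1, 1),
                         today=date(2024, 1, 31))
print(profile)
```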
Once we have some lineage information, then we'll weight the data based on how frequently it's needed, how many formulas require the field, etc.
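The weighting idea could look something like the following. It is a minimal sketch, assuming we already have a 1-5 score per field and a count of how many workflows or formulas use each field; the function name `weighted_quality_score` and the "1 + usage count" weight are hypothetical choices, picked so that fields with no known usage still count a little.

```python
from typing import Dict


def weighted_quality_score(scores: Dict[str, int],
                           usage_counts: Dict[str, int]) -> float:
    """Usage-weighted average of per-field quality scores (hypothetical helper)."""
    total = 0.0
    weight_sum = 0.0
    for field, score in scores.items():
        # Assumption: weight grows linearly with how often the field is used.
        weight = 1 + usage_counts.get(field, 0)
        total += weight * score
        weight_sum += weight
    return total / weight_sum if weight_sum else 0.0


overall = weighted_quality_score({"address": 4, "phone": 2}, {"address": 3})
print(overall)
```

Here the heavily used `address` field pulls the overall score toward its own value of 4, while the unused `phone` field only contributes with the base weight.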
And as soon as we install Connect, we'll have a big-picture view of our data, and we'll even be able to track the status of all our distributed data assets with a trend line showing whether we are getting better or worse. Here is an example: