Connect applies a standard set of weightings to different categories of information (people, terms etc.) when returning search results. When combined with likes/dislikes, these determine the order in which results are returned - details below:
Alteryx Connect uses the following scoring parameters for the Lucene engine:
Likes and Dislikes, using the following formula: (Number of Likes) / (Number of Likes + Number of Dislikes).
Certified assets: +1.2
Report sheet: +1.8
Alteryx workflow: +1.5
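To make the parameters above concrete, here is a minimal sketch of the like ratio and the listed boosts. The post does not say how Connect combines these with the Lucene relevance score, so the additive `combined_score` below is only an assumption for illustration. The dictionary keys, the function names, and the handling of assets with no votes are hypothetical too.

```python
# Boost values quoted in the post; the keys are hypothetical labels.
ASSET_BOOSTS = {
    "certified": 1.2,
    "report_sheet": 1.8,
    "alteryx_workflow": 1.5,
}


def like_ratio(likes: int, dislikes: int) -> float:
    """(Number of Likes) / (Number of Likes + Number of Dislikes)."""
    total = likes + dislikes
    if total == 0:
        # Assumption: an asset with no votes contributes nothing.
        return 0.0
    return likes / total


def combined_score(base_score: float, likes: int, dislikes: int, tags: list) -> float:
    """Assumed combination: base relevance + like ratio + sum of boosts."""
    boost = sum(ASSET_BOOSTS.get(tag, 0.0) for tag in tags)
    return base_score + like_ratio(likes, dislikes) + boost


if __name__ == "__main__":
    # A certified workflow with 3 likes and 1 dislike, base relevance 1.0.
    print(combined_score(1.0, 3, 1, ["certified", "alteryx_workflow"]))
```

If the weightings were exposed as configuration, a table like `ASSET_BOOSTS` is the kind of thing an administrator would want to edit.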
It would be useful to have control over this weighting, e.g. when large numbers of Person records are returned before Terms. However, the advice from Customer Support has been that these weightings are not currently customisable. I'd like to request that this ability be considered for inclusion in a future release of Connect.
When Alteryx Connect is first installed at a company with a small Alteryx Designer user base, you do not benefit from lineage.
There are not many workflows at hand yet. So, in order to realize Alteryx Connect's benefits immediately, I'd like to suggest
a company-wide Data Quality Score.
Let's score each data element in the distributed data stores
and automatically assign it a simple score between one and five:
1 means “we don’t know”.
2 means the data was entered or last updated more than 1 year ago, or has conflicting values.
3 is the norm: the customer provided this data, and it is as accurate and up to date as when they entered it and ‘agreed’ to share it with you.
4 means “we cross-checked the data with 3rd-party sources, or the addresses work in Google Maps”.
5 means “the customer or their representative validated the address in the last 3 months”.
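As a rough sketch, the five levels above could be assigned by a small rule function. Everything here is hypothetical: the `RecordInfo` fields, the 90-day and 365-day cut-offs standing in for "3 months" and "1 year", and the order in which the rules are checked. Level 2 is read as "stale or conflicting", which is only one possible interpretation.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class RecordInfo:
    """Hypothetical metadata known about one data element."""
    last_updated: Optional[date] = None
    has_conflicts: bool = False
    cross_checked: bool = False
    customer_validated_on: Optional[date] = None


def quality_score(info: RecordInfo, today: date) -> int:
    """Map record metadata to the 1-5 scale (rule order is an assumption)."""
    if info.last_updated is None:
        return 1  # "we don't know"
    validated = info.customer_validated_on
    if validated is not None and today - validated <= timedelta(days=90):
        return 5  # validated by the customer in roughly the last 3 months
    if info.cross_checked:
        return 4  # cross-checked against a 3rd-party source
    if info.has_conflicts or today - info.last_updated > timedelta(days=365):
        return 2  # older than a year, or conflicting
    return 3  # the norm: as provided by the customer


if __name__ == "__main__":
    today = date(2024, 6, 1)
    print(quality_score(RecordInfo(last_updated=date(2024, 1, 1)), today))
```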
The score will be based on:
Information value (whether variance is high; a column with no variance carries no useful information)
How many times the column is referenced in other tables
Format (structured, like a telephone number ###-##-##, or semi-structured, like an address)
Whether it is an ID column
Whether it is a datetime column, and whether its values contain any discrepancies
Time since the data was last updated
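Several of the criteria above can be computed directly from a column's values. The sketch below profiles one column in plain Python. The function name, the returned keys, and the heuristics are all assumptions for illustration: "more than one distinct value" stands in for variance, and "all values unique and non-null" stands in for an ID column. The regular expression encodes the ###-##-## telephone format from the list.

```python
import re
from datetime import datetime

# The structured format quoted in the post: ###-##-##
PHONE_PATTERN = re.compile(r"^\d{3}-\d{2}-\d{2}$")


def profile_column(values: list, last_updated: datetime, now: datetime) -> dict:
    """Compute simple, hypothetical quality signals for one column."""
    non_null = [v for v in values if v is not None]
    distinct = len(set(non_null))
    matching = sum(1 for v in non_null if PHONE_PATTERN.match(str(v)))
    return {
        # No variance means the column carries no useful information.
        "has_variance": distinct > 1,
        # Heuristic: every value present and unique suggests an ID column.
        "looks_like_id": bool(values) and distinct == len(values),
        # Share of non-null values that follow the structured format.
        "structured_share": matching / len(non_null) if non_null else 0.0,
        # Time since the data was last updated, in whole days.
        "days_since_update": (now - last_updated).days,
    }


if __name__ == "__main__":
    column = ["555-12-34", "555-12-35", None]
    print(profile_column(column, datetime(2024, 1, 1), datetime(2024, 1, 31)))
```

Signals like these could then be combined into the one-to-five score. How to weight them is left open here, as it is in the post.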
Once we have some lineage information, we'll then weight the data based on how frequently it is needed, how many formulas require the field, etc.
And as soon as we install Connect, we'll have a grand overview of our data, and we'll even be able to track the status of all our distributed data assets with a trend line showing whether we are getting better or worse... Here is an example: