A general question regarding high vs. low selectivity. I understand how to determine which fields should be high vs. low selectivity, based on the general guidelines described in the Alteryx Help.
However, if I only have a .cydb file, is there any way I can tell which fields have been indexed according to high vs. low selectivity rules if I don't have access to the workflow in which the Calgary database was initially created?
Also, more generally, can anyone describe what Alteryx actually does to optimize the indexing based on selectivity? In other words, if I categorize a low selectivity field (e.g. "day of the week") as "high," my workflows will run more slowly, but why? (I'm just curious.)
Thanks!
The fields indexed as Low Selectivity are listed in [CYDB FILENAME]_Indexes.xml, although that file is not formatted for easy reading. You can also look in the directory where you wrote the .cydb; all of the index files (.cyidx) are there. A small index file suggests low selectivity, and a large one suggests high selectivity. For example, I have a 5 GB data file whose indexes range from 30 KB (low) to 245 KB (high).
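If there are many indexes, a short script can do the size comparison for you. This is only a rough sketch in Python: the function name `list_index_sizes` is made up for illustration, it assumes the .cyidx files sit in the same folder as the .cydb, and file size is just a hint about selectivity, not a definitive answer.

```python
from pathlib import Path


def list_index_sizes(directory):
    """Return (file name, size in bytes) for every .cyidx file, smallest first."""
    files = Path(directory).glob("*.cyidx")
    sizes = [(f.name, f.stat().st_size) for f in files if f.is_file()]
    return sorted(sizes, key=lambda item: item[1])


if __name__ == "__main__":
    # "." is a placeholder; point this at the folder that holds your .cydb.
    for name, size in list_index_sizes("."):
        print(f"{size / 1024:10.1f} KB  {name}")
```

The smallest files at the top of the output are the likely low selectivity indexes, and the largest at the bottom are the likely high selectivity ones.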
I hope that this helps,
Mark
@MarqueeCrew Yes, that is exactly the information that I was looking for; thank you!