Can someone please help me understand and interpret the 'distance' metric in the AB Controls tool? I am selecting control stores and am having trouble explaining the metric. I know that it represents the distance between the treatment and control units, but I'm not sure of the scale (I'm assuming closer to 0 is better, etc.).
I'm guessing the distance is a unitless calculation in abstract Euclidean space: if n is the number of features, then we're talking about the distance in n-dimensional space from the point of the unknown (treatment?) to the nearest known (control?) unit.
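To make that guess concrete, here is a minimal sketch of a Euclidean distance from one treatment unit to a few candidate controls. I don't know what the AB Controls tool actually computes internally, so treat this purely as an illustration; the store names and feature values are made up.

```python
import math

def euclidean_distance(a, b):
    """Straight-line distance between two points with the same number of features."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical data: one treatment unit and three candidate controls,
# each described by n = 2 features.
treatment = [3.0, 4.0]
controls = {
    "store_A": [0.0, 0.0],
    "store_B": [3.0, 5.0],
    "store_C": [6.0, 8.0],
}

# Distance from the treatment unit to every candidate control.
distances = {name: euclidean_distance(treatment, feats)
             for name, feats in controls.items()}

# The best match is the control with the smallest distance.
nearest = min(distances, key=distances.get)
print(distances, nearest)
```

The number has no unit of its own. It only means something relative to the other distances computed on the same features.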
I'm not sure that being closer to zero is meaningfully "better" than being only slightly farther away. At the other end, though, the farther a match is from zero, the more likely it is that your unknown point is an outlier with no really comparable known unit.
In kNN (k-nearest neighbors, which is strictly a classification/regression method rather than clustering), you would predict an unknown based on the k nearest knowns. In its basic brute-force form it's a slow algorithm, since every prediction requires you to look at the entire "known" set and find which k are closest. (But who knows... here is some reading for the fearless. :-))
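The "look at the entire known set" step can be sketched as a brute-force lookup. This is only an illustration of the general idea, not the tool's implementation; `k_nearest` and the sample points are hypothetical names and data.

```python
import math

def k_nearest(query, known, k):
    """Return the k known points closest to query as (distance, name) pairs."""
    # Every lookup scans the whole known set, which is why this gets slow.
    scored = [(math.dist(query, feats), name) for name, feats in known.items()]
    scored.sort()  # smallest distance first; ties broken by name
    return scored[:k]

# Hypothetical known units with two features each.
known = {
    "a": [0.0, 0.0],
    "b": [1.0, 1.0],
    "c": [5.0, 5.0],
    "d": [2.0, 2.0],
}

neighbors = k_nearest([1.0, 0.0], known, k=2)
neighbor_names = [name for _, name in neighbors]
print(neighbors)
```

Real libraries usually avoid the full scan with spatial indexes, but the result is the same idea: rank the knowns by distance and keep the top k.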
One concern with this sort of calculation is that of weight: a feature measured on a large scale will dominate the distance, so you may wish to standardize/normalize your data prior to running it through any sort of nearest-neighbor calculation. It should also be known that "distance" can be whatever kind of function you want. Euclidean is the easiest to visualize, but others are in common use too, such as Manhattan, Mahalanobis, and cosine distance.
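Here is a small sketch of why the scaling matters, assuming plain z-score standardization and Euclidean distance (both are my assumptions, not something I know about the tool). The stores and their two features, say annual sales and staff count, are made up.

```python
import math
import statistics

def standardize(rows):
    """Z-score each column: subtract the column mean, divide by its population std dev."""
    columns = list(zip(*rows))
    means = [statistics.mean(col) for col in columns]
    stdevs = [statistics.pstdev(col) for col in columns]
    return [
        [(v - m) / s if s else 0.0 for v, m, s in zip(row, means, stdevs)]
        for row in rows
    ]

def nearest_to_first(rows, names):
    """Name of the row closest (Euclidean) to the first row."""
    target = rows[0]
    candidates = zip(names[1:], rows[1:])
    return min(candidates, key=lambda pair: math.dist(target, pair[1]))[0]

# Hypothetical features: [annual sales, staff count].
names = ["treatment", "store_X", "store_Y", "store_Z"]
rows = [[100000, 2], [101000, 10], [120000, 2], [200000, 5]]

raw_nearest = nearest_to_first(rows, names)                  # sales dominates
scaled_nearest = nearest_to_first(standardize(rows), names)  # both features count
print(raw_nearest, scaled_nearest)
```

On the raw numbers the sales column swamps everything, so store_X wins despite having five times the staff. After standardizing, store_Y (same staff, somewhat higher sales) comes out as the closer match.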