Hi all.
It seems to me that the Unique tool alters the order of the rows (it sorts them). I guess sorting helps the tool run quickly, but in some cases I would rather preserve the original row order. The tool doesn't explicitly say that it sorts the rows. The same goes for the Summarize tool when you group by. What behavior do you expect?
Best regards//Leif
Thanks for your reply.
Yes, that is what I do when I need to preserve order, but I was looking more for a discussion on whether it is unreasonable to expect the row order to be preserved unless the tool explicitly says that it will "sort AND filter out duplicates".
Best regards//Leif
I believe so. By taking unique values out of the data set, the data order has fundamentally changed, so you shouldn't expect it to look the same. The tool has to sort in order to place duplicate items together. By deleting rows, the original sort order doesn't really make sense any more.
However, if you place a Sort before the Unique tool, you can control the output. I use this method to prioritize laboratory result duplicates or updates. The attached workflow shows how you can sort first to prioritize the latest entries or positives.
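For readers outside Alteryx, the sort-first idea can be sketched in Python. This is only an illustration of the pattern, not the attached workflow: the function name `latest_per_key` and the field names `sample`, `date`, and `result` are made up for the example.

```python
from operator import itemgetter


def latest_per_key(rows, key, order_by):
    """Keep one row per key, preferring the row with the largest order_by value."""
    # Sort so the preferred record for each key comes first (newest first here).
    ranked = sorted(rows, key=itemgetter(order_by), reverse=True)
    seen = set()
    kept = []
    for row in ranked:
        if row[key] not in seen:
            # First time we meet this key: it is the preferred row.
            seen.add(row[key])
            kept.append(row)
    return kept


# Hypothetical lab results; ISO dates sort correctly as plain strings.
results = [
    {"sample": "A", "date": "2020-01-01", "result": "neg"},
    {"sample": "B", "date": "2020-01-02", "result": "neg"},
    {"sample": "A", "date": "2020-01-05", "result": "pos"},
]
latest = latest_per_key(results, "sample", "date")
```

Note that the output comes back in the sorted order, not the original row order, which is the behavior this thread is about.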
Hi Philip.
Thanks for illuminating this use case. I've used Summarize (First on strings, grouped by key) for this purpose before, but this is handy when you have composite keys. Good one.
But I disagree with the notion that removing rows changes the order. In my view it doesn't. The order is only compromised if the sequence of any two rows is interchanged.
Consider soldiers in an alphabetical row. If soldiers with names beginning with the letters "C" and "F" are removed, the alphabetical order is still intact.
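The soldier example can be checked with a small Python sketch. The names and the helper `is_subsequence` are invented for illustration; the point is only that removing rows leaves a subsequence of the original, whereas sorting an unsorted list generally does not.

```python
def is_subsequence(short, full):
    """True if every item of short appears in full in the same relative order."""
    remaining = iter(full)
    # "item in remaining" advances the iterator, so matches must occur in order.
    return all(item in remaining for item in short)


soldiers = ["Adams", "Baker", "Clark", "Davis", "Evans", "Ford", "Grant"]

# Remove everyone whose name begins with "C" or "F".
left_standing = [name for name in soldiers if name[0] not in ("C", "F")]
order_kept = is_subsequence(left_standing, soldiers)

# By contrast, sorting an unsorted row does interchange people.
unsorted_row = ["Grant", "Adams", "Ford", "Baker"]
order_lost = not is_subsequence(sorted(unsorted_row), unsorted_row)
```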
As others have pointed out, Alteryx needs to sort to perform the Unique in the first place. Remember that Alteryx needs to work just as well with 10 billion records as it does with 10 records. The expense of putting the records back in order would be high, and for people who don't need that, why bother?
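To see why a sort-based approach naturally returns sorted output, here is a minimal Python sketch. It is not Alteryx's actual implementation, just the general technique of sorting so that duplicates sit next to each other; the function name `sort_based_unique` is hypothetical.

```python
from itertools import groupby


def sort_based_unique(values):
    """Deduplicate by sorting first, then collapsing runs of equal values."""
    # Sorting places equal values next to each other.
    ordered = sorted(values)
    # groupby yields one key per run of adjacent equal values.
    return [key for key, _run in groupby(ordered)]


data = ["pear", "apple", "pear", "fig", "apple"]
deduped = sort_based_unique(data)
```

The result is unique but in sorted order; getting back to the original order would take an extra pass that tracks each row's original position.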
That said, Alteryx is super flexible. I built a quick (rough) macro version of the Unique tool that works the way you expect. It is totally reasonable to do for smaller data sets. So enjoy this macro, and know that in the future it is pretty easy to make Alteryx work the way you want it to.
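The macro itself is not reproduced in this thread, so the following Python sketch is only a guess at one common way to get this behavior: tag each row with a record ID, deduplicate with a sort, then sort back by the ID. The function name `unique_preserving_order` and the sample fields are assumptions for the example, and the key values are assumed to be sortable.

```python
def unique_preserving_order(rows, key_fields):
    """Keep the first row per key and return rows in their original order."""
    # Step 1: tag each row with its original position (a record ID).
    tagged = list(enumerate(rows))

    def key_of(row):
        return tuple(row[field] for field in key_fields)

    # Step 2: sort by key, then by record ID so the earliest row leads each run.
    tagged.sort(key=lambda pair: (key_of(pair[1]), pair[0]))

    # Step 3: keep only the first row of each run of equal keys.
    firsts = []
    previous = object()  # sentinel that never equals a real key
    for record_id, row in tagged:
        current = key_of(row)
        if current != previous:
            firsts.append((record_id, row))
            previous = current

    # Step 4: restore the original order using the record ID.
    firsts.sort(key=lambda pair: pair[0])
    return [row for _record_id, row in firsts]


rows = [
    {"city": "Oslo", "n": 1},
    {"city": "Bergen", "n": 2},
    {"city": "Oslo", "n": 3},
    {"city": "Arendal", "n": 4},
]
result = unique_preserving_order(rows, ["city"])
```

The second sort in step 4 is the extra cost mentioned above: cheap for small inputs, noticeable for very large ones.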
Thanks for your support, Ned. What would be super neat is a check box that you tick if you want to preserve row order (at the expense of performance, of course).
That way we could solve two things.
1. Set expectations on what to get.
2. Allow us to easily circumvent a behavior that in some cases is unwanted.
I know that check boxes kill the beauty and that it is easy to solve by other means, so don't take my request too seriously :-)
I guess it all depends on who the next user of Unique will be: a super-experienced data modeler or scientist, or a business analyst trying to get out of Excel hell.
Best regards///Leif