Unique tool
Hi all.
It seems to me that the Unique tool alters the order of the rows (it sorts them). I guess sorting helps the tool run quickly, but in some cases I would rather preserve the original row order. The tool doesn't explicitly say that it sorts the rows. The same goes for the Summarize tool when you group by. What expectations do you have?
Best regards//Leif
Imagine that you have a phone book sorted by phone number and you wanted to remove any duplicate addresses. The unique tool would sort by address to find the unique values.
To get the original order back, I would place a Record ID tool before the Unique tool and a Sort tool after it. Sort on the record ID and the remaining rows are back in their original order, and you're back in business.
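For anyone who thinks better in code, here is a minimal Python sketch of that Record ID, Unique, Sort pattern. It is only an illustration, not what Alteryx runs internally: the function name `unique_preserving_order` and the sample phone book are made up for this example, and it assumes the Unique step is sort-based as described above.

```python
from itertools import groupby
from operator import itemgetter


def unique_preserving_order(rows, key):
    """Hypothetical helper: sort-based de-duplication that restores input order."""
    # Step 1 (Record ID tool): tag every row with its original position.
    tagged = list(enumerate(rows))
    # Step 2 (sort-based Unique): sort by the key, then keep the first row of
    # each run of equal keys. Python's sort is stable, so within one key the
    # row with the lowest record ID comes first.
    tagged.sort(key=lambda pair: key(pair[1]))
    firsts = [next(group) for _, group in groupby(tagged, key=lambda pair: key(pair[1]))]
    # Step 3 (Sort tool): sort on the record ID to get the original order back.
    firsts.sort(key=itemgetter(0))
    return [row for _, row in firsts]


# Made-up phone book, ordered by phone number: (phone, address).
phone_book = [
    ("555-0101", "9 Elm St"),
    ("555-0102", "2 Oak Ave"),
    ("555-0103", "9 Elm St"),
    ("555-0104", "5 Pine Rd"),
]
result = unique_preserving_order(phone_book, key=itemgetter(1))
print(result)
```

The duplicate address is dropped and the surviving rows stay in phone-number order.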
Cheers,
Mark
Thanks for your reply.
Yes, that is what I do when I need to preserve order, but I was looking more for a discussion on "is it unreasonable to expect the row order to be preserved" unless the tool explicitly says that it will "sort AND filter out duplicates".
Best regards//Leif
I believe so. By taking unique values out of the data set, the data order has fundamentally changed, so you shouldn't expect it to look the same. The tool has to sort in order to place duplicate items together. By deleting rows, the original sort order doesn't really make sense any more.
However, if you place a sort before the unique tool, you can control the output. I use this method to prioritize laboratory result duplicates or updates. The attached workflow shows how you can use the sort-first to prioritize latest entries or positives.
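The sort-first idea can be sketched in Python roughly like this. The lab records, field names, and the helper `first_per_key` are invented for illustration and are not taken from the attached workflow; the sketch only assumes that the de-duplication step keeps the first row it sees for each key.

```python
from datetime import date


def first_per_key(rows, key):
    """Hypothetical helper: keep the first row seen for each key."""
    kept = {}
    for row in rows:
        # setdefault stores the row only if this key has not been seen yet.
        kept.setdefault(key(row), row)
    # Dicts keep insertion order, so rows come back in first-seen order.
    return list(kept.values())


# Made-up lab results with one duplicate (patient A, strep test).
lab_results = [
    {"patient": "A", "test": "strep", "collected": date(2024, 1, 3), "result": "negative"},
    {"patient": "A", "test": "strep", "collected": date(2024, 1, 9), "result": "positive"},
    {"patient": "B", "test": "strep", "collected": date(2024, 1, 5), "result": "negative"},
]

# Sort first, newest on top, so the "first" duplicate is the latest entry.
newest_first = sorted(lab_results, key=lambda r: r["collected"], reverse=True)
latest = first_per_key(newest_first, key=lambda r: (r["patient"], r["test"]))
print([(r["patient"], r["result"]) for r in latest])
```

Changing the sort key (for example, positives before negatives) changes which duplicate survives without touching the de-duplication step.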
Hi Philip.
Thanks for illuminating this use-case. I've used summarize (first on strings and group on key) for this purpose before, but this is handy when having composite keys. Good one.
But I disagree with the notion that removing rows changes the order; in my view it doesn't. The order is only compromised if the sequence of any two rows is interchanged.
Consider soldiers lined up in alphabetical order. If the soldiers whose names begin with the letters "C" and "F" are removed, the alphabetical order is still intact.
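Put in code, "order preserved" here means the remaining rows form a subsequence of the original rows. A small Python sketch of that check, with made-up names and a hypothetical helper `keeps_relative_order`:

```python
def keeps_relative_order(original, remaining):
    """True if `remaining` is a subsequence of `original` (no two rows swapped)."""
    it = iter(original)
    # `item in it` advances the iterator until it finds the item, so each
    # remaining row must appear after the previous one in the original.
    return all(item in it for item in remaining)


soldiers = ["Adams", "Baker", "Clark", "Davis", "Evans", "Ford", "Grant"]
without_c_and_f = [s for s in soldiers if s[0] not in "CF"]

print(keeps_relative_order(soldiers, without_c_and_f))     # removal only
print(keeps_relative_order(soldiers, ["Baker", "Adams"]))  # two rows swapped
```

Removing rows passes the check; swapping any two rows fails it.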
As others have pointed out, Alteryx needs to sort to perform the unique in the first place. Remember that Alteryx needs to work just as well with 10 billion records as it does with 10 records. The expense of putting the records back in order would be high, and for people who don't need that, why bother?
That said, Alteryx is super flexible. I built a quick (rough) macro version of the Unique tool that works the way you expect. It is totally reasonable to do for smaller data sets. So enjoy this macro, and know that in the future it is pretty easy to make Alteryx work the way you want it to.
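The macro itself is not reproduced in this thread, so the following is not its implementation. It is just one possible Python sketch of an order-preserving unique: a single pass that remembers which keys it has already seen. The function name `unique_in_order` and the sample data are assumptions for illustration.

```python
def unique_in_order(rows, key=lambda row: row):
    """Hypothetical helper: yield the first row for each key, in input order."""
    seen = set()
    for row in rows:
        k = key(row)
        if k not in seen:
            # First time this key appears: remember it and pass the row through.
            seen.add(k)
            yield row


rows = ["b", "a", "b", "c", "a"]
in_order = list(unique_in_order(rows))
sorted_unique = sorted(set(rows))  # what a sort-based unique would return
print(in_order, sorted_unique)
```

The set of seen keys grows with the number of distinct keys, which is one reason an approach like this is more comfortable on smaller data sets than on billions of records.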
Thanks for your support Ned. What would be super neat is a check box that you use if you want to preserve row order (at the expense of performance of course).
That way we could solve two things.
1. Set expectations on what to get.
2. Allow us to easily circumvent a behavior that in some cases is unwanted.
I know that check boxes kill the beauty and that it is easy to solve this by other means, so don't take my request too seriously :-)
I guess it all depends on who the next user of Unique will be: a super-experienced data modeler or scientist, or a business analyst trying to get out of Excel hell.
Best regards///Leif
