Dear Alteryx community,
Probably this is a very basic task. I searched for quite a bit but couldn't find a solution. Imagine I have the following table:
Personnel ID | Registration Date |
45 | 2018-02-21 09:35:11 |
35 | 2015-09-08 04:00:23 |
45 | 2018-03-06 07:59:00 |
Each person should only have one row in this table but for #45, there are 2 rows. The older date is wrong and needs to be excluded. I only want to include the most recent date. How to do this in Alteryx without having to program a long workflow? sample file or a tutorial link would be much appreciated.
Thank you so much
Solved! Go to Solution.
Hi,
I'm currently installing a new version of Alteryx, so running without designer up, but here's the theory on two methods for this (although there are others):
Method 1: Summarize and Join
Using a Summarize tool, you can Group By Personnel ID and take the MAX Registration Date. Then you can join back to your main data set on Personnel ID = Personnel ID and Max_Registration Date = Registration Date, and deselect any duplicate fields.
Method 2: Sort and Sample
Sort your dataset by Personnel ID and Registration Date (Ascending or Descending). Then, you can use the Sample tool, Grouped by Personnel ID, and select the Last (if sorted Ascending) or First (if sorted Descending) 1 record. This will get you one record per personnel id.