I am trying to compare two data sets which have a lot of values in common. I have browsed the community but nothings is working for me. I want to compare for each employee, if they have the same Project ID and corresponding Hours on both data set. I would just need a list of where they are different. I have tried several workflows but somehow it is giving me duplicate rows within the Project ID column and I cannot further. Thanks for your help.
Data set 1
Employee Name | Emp ID | Project ID | Project Type | Hours |
A1 | 1 | 287026-21 | C | 40 |
A2 | 2 | 393417-01 | C | 40 |
A3 | 3 | 303955-04 | C | 40 |
A4 | 4 | 281392-03 | C | 32 |
A4 | 4 | 264592-92 | P | 8 |
A5 | 5 | 365196-13 | P | 40 |
A6 | 6 | 287026-21 | X | 40 |
A7 | 7 | 364636-05 | X | 40 |
A8 | 8 | 287026-38 | L | 40 |
A9 | 9 | 364636-04 | F | 16 |
A9 | 9 | 394642-08 | C | 24 |
A10 | 10 | 394642-08 | C | 40 |
A11 | 11 | 281392-03 | C | 40 |
Data set 2
Employee Name | Emp ID | Project ID | Project Type | Hours |
A1 | 1 | 287026-21 | C | 32 |
A1 | 1 | CAP/DFZ-00 | Z | 8 |
A2 | 2 | 393417-01 | C | 40 |
A3 | 3 | 303955-04 | C | 40 |
A4 | 4 | 281392-03 | C | 32 |
A4 | 4 | 264592-94 | P | 8 |
A5 | 5 | 365196-13 | P | 40 |
A6 | 6 | 287026-20 | X | 40 |
A7 | 7 | 364636-05 | X | 40 |
A8 | 8 | 287026-38 | L | 40 |
A9 | 9 | 364636-04 | F | 16 |
A9 | 9 | 394642-08 | C | 24 |
A10 | 10 | 394642-08 | C | 40 |
A11 | 11 | 281392-03 | C | 40 |
Solved! Go to Solution.
You need one tool to perform this job, a join tool, example attached...
You are interested in the data from the 'Left' and 'Right' anchors; you could then build some flags about what exactly the issue is, whether it is just completely missing from the other file, or just the amount of hours logged is different.
Ben
Superstar, it was so simple I feel terrible to have asked!
I am still getting duplicated rows in my data though so in my table it is incorrectly telling me that this employee has all those entries when actually he has 20 hours on 365336-34 and 20 hours on 365336-38 in Data set 1, it is just split differently in Data set 2. how do I get rid for these extra lines
A2 | 95,787 | Principal | C | 365336-34 | 365336-34 | 20 | 18 | Difference in Logged Hours | |
A2 | 95,787 | Principal | C | 365336-38 | 365336-38 | 20 | 18 | Difference in Logged Hours | |
A2 | 95,787 | Principal | C | 365336-34 | 365336-34 | 20 | 18 | Difference in Logged Hours | |
A2 | 95,787 | Principal | C | 365336-38 | 365336-38 | 20 | 18 | Difference in Logged Hours | |
A2 | 95,787 | Principal | C | 365336-34 | 365336-34 | 20 | 8 | Difference in Logged Hours | |
A2 | 95,787 | Principal | C | 365336-38 | 365336-38 | 20 | 8 | Difference in Logged Hours | |
A2 | 95,787 | Principal | C | 365336-34 | 365336-34 | 20 | 8 | Difference in Logged Hours | |
A2 | 95,787 | Principal | C | 365336-38 | 365336-38 | 20 | 8 | Difference in Logged Hours | |
A2 | 95,787 | Principal | C | 365336-34 | 365336-34 | 20 | 4 | Difference in Logged Hours | |
A2 | 95,787 | Principal | C | 365336-38 | 365336-38 | 20 | 4 | Difference in Logged Hours | |
A2 | 95,787 | Principal | C | 365336-34 | 365336-34 | 20 | 4 | Difference in Logged Hours | |
A2 | 95,787 | Principal | C | 365336-38 | 365336-38 | 20 | 4 | Difference in Logged Hours | |
A2 | 95,787 | Principal | C | 365336-34 | 365336-28 | Difference in Project ID | 20 | 18 | Difference in Logged Hours |
A2 | 95,787 | Principal | C | 365336-38 | 365336-28 | Difference in Project ID | 20 | 18 | Difference in Logged Hours |
A2 | 95,787 | Principal | C | 365336-38 | 365336-34 | Difference in Project ID | 20 | 18 | Difference in Logged Hours |
A2 | 95,787 | Principal | C | 365336-34 | 365336-38 | Difference in Project ID | 20 | 18 | Difference in Logged Hours |
A2 | 95,787 | Principal | C | 365336-34 | 365336-28 | Difference in Project ID | 20 | 18 | Difference in Logged Hours |
A2 | 95,787 | Principal | C | 365336-38 | 365336-28 | Difference in Project ID | 20 | 18 | Difference in Logged Hours |
A2 | 95,787 | Principal | C | 365336-38 | 365336-34 | Difference in Project ID | 20 | 18 | Difference in Logged Hours |
A2 | 95,787 | Principal | C | 365336-34 | 365336-38 | Difference in Project ID | 20 | 18 | Difference in Logged Hours |
A2 | 95,787 | Principal | C | 365336-34 | 365336-28 | Difference in Project ID | 20 | 8 | Difference in Logged Hours |
A2 | 95,787 | Principal | C | 365336-38 | 365336-28 | Difference in Project ID | 20 | 8 | Difference in Logged Hours |
A2 | 95,787 | Principal | C | 365336-38 | 365336-34 | Difference in Project ID | 20 | 8 | Difference in Logged Hours |
A2 | 95,787 | Principal | C | 365336-34 | 365336-38 | Difference in Project ID | 20 | 8 | Difference in Logged Hours |
A2 | 95,787 | Principal | C | 365336-34 | 365336-28 | Difference in Project ID | 20 | 8 | Difference in Logged Hours |
A2 | 95,787 | Principal | C | 365336-38 | 365336-28 | Difference in Project ID | 20 | 8 | Difference in Logged Hours |
A2 | 95,787 | Principal | C | 365336-38 | 365336-34 | Difference in Project ID | 20 | 8 | Difference in Logged Hours |
A2 | 95,787 | Principal | C | 365336-34 | 365336-38 | Difference in Project ID | 20 | 8 | Difference in Logged Hours |
A2 | 95,787 | Principal | C | 365336-34 | 365336-28 | Difference in Project ID | 20 | 4 | Difference in Logged Hours |
A2 | 95,787 | Principal | C | 365336-38 | 365336-28 | Difference in Project ID | 20 | 4 | Difference in Logged Hours |
A2 | 95,787 | Principal | C | 365336-38 | 365336-34 | Difference in Project ID | 20 | 4 | Difference in Logged Hours |
A2 | 95,787 | Principal | C | 365336-34 | 365336-38 | Difference in Project ID | 20 | 4 | Difference in Logged Hours |
A2 | 95,787 | Principal | C | 365336-34 | 365336-28 | Difference in Project ID | 20 | 4 | Difference in Logged Hours |
A2 | 95,787 | Principal | C | 365336-38 | 365336-28 | Difference in Project ID | 20 | 4 | Difference in Logged Hours |
A2 | 95,787 | Principal | C | 365336-38 | 365336-34 | Difference in Project ID | 20 | 4 | Difference in Logged Hours |
A2 | 95,787 | Principal | C | 365336-34 | 365336-38 | Difference in Project ID | 20 | 4 | Difference in Logged Hours |