Alteryx designer Discussions

Find answers, ask questions, and share expertise about Alteryx Designer.

Compare two datasets and flag rule violation

7 - Meteor

Hello everyone, 

Been stuck on this problem for a while now. I am not able to think through the logic and implementation using alteryx to achieve the desired results. 


There's two data sets, given below-

Data  set 1


Risk IDRule IDCode

The combination of risk and rule ID in the first data set leads to the unique list of codes. 


Data set 2


Users are tagged with codes that they use.


When they use codes that are in the same Risk and Rule ID family, it is a violation. 

I just want to be able to flag those users who have violated this rule. 


In the above data sets, User1 has violated the rule as he used two codes (FBREV and FBREV1). Both of the codes fall under the (FR01+ 00Q) rule. Hence I need to be able to tag User1 as a rule violator. 


I have been breaking my head over this for a few days now.

Any help is highly appreciated.  


Thanks so much in advance. 



Alteryx Certified Partner


I would join the two datasets together on Code.  then from this joined dataset, I would run a summarize on Column1, RiskID and RuleID and then count the records.  Group by on Column1, RiskID and RuleID, count on Column1.  Run a filter to select any records where count is >1.  Those are Users which have a violation.  from this list of violation records, you can join back to dataset#2 and flag the records that join with a violation.

Hope this helps.  Let me know if you need any clarification or further help.


7 - Meteor

Hey Maureen, thanks so much for your response. 

I tried what you had suggested below


Data set 1 has 162,755 records

Data set 2 has 10.8M records


and the join on Code ended up giving 717M records. 


Since we just joined on Code, and there are many codes with different Risk ID and Rule ID, we end up with a many to many situation which might not really be needed. 


Thank you,



Alteryx Certified Partner

The best way to eliminate unneeded joins is to run a summarize (or unique) on both datasets before the join.  Dataset1 - group by riskid, ruleid and code.  Dataset2 - group by on Col1 and code.  The resulting joins are expected.

Hope that makes sense unless I am misunderstanding the rules around the data.

Let me know if you still have a question.

7 - Meteor

Hey Maureen, thanks so much!! that worked. it was able to identify everything correctly. thanks!!!!!!