I am creating a repetitive flow which is comparing the data within our system to files received from the customer. These files are very large (~40,000 lines).
I need to compare the children's information but some families may have more than one child. Our system has an unique identifier for each child but the customers file doesn't. Both the system and customer's file has an identifier for the entire family.
I need to find out if there are new children, children no longer on the plan, if the child's first name, last name, DOB, etc.. is different between our system and the customers file.
If I concatenate the data by the parent's identifier and join by the concatenation the children are not matched/lined up properly (as some families have more than one child). I want to be specific with the required action, example; new child, remove child, name error, etc. so if I concatenate the first name with the parent's identifier it will not be specific enough, it could be a spelling error, new child, etc.
Any ideas?