Hello everyone,
I need some help cleaning a dataset.
My input file contains customer IDs, record IDs, stages and timestamps.
All of these record IDs are currently set as Main, although they should not be.
Each customer may have two or more record IDs, but only one of them should be set as Main.
The Main record is the one that has the most advanced Stage.
If two records have the same Stage, the most recent one should be Main.
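To make the rule concrete, here is a rough sketch of it in plain Python, outside of any workflow tool. The field names (`customer_id`, `record_id`, `stage`, `timestamp`), the numeric stage values, the timestamp format and the rows themselves are all assumptions made up for illustration; they are not taken from the real file.

```python
from datetime import datetime

# Hypothetical rows: IDs, stages and timestamps are invented for illustration.
rows = [
    {"customer_id": "C1", "record_id": "R1", "stage": 2, "timestamp": "2023-01-10 09:00:00"},
    {"customer_id": "C1", "record_id": "R2", "stage": 3, "timestamp": "2023-01-05 09:00:00"},
    {"customer_id": "C2", "record_id": "R3", "stage": 1, "timestamp": "2023-02-01 09:00:00"},
    {"customer_id": "C2", "record_id": "R4", "stage": 1, "timestamp": "2023-03-01 09:00:00"},
    {"customer_id": "C3", "record_id": "R5", "stage": 4, "timestamp": "2023-01-01 09:00:00"},
]

def sort_key(row):
    # Compare by stage first, then by timestamp to break ties.
    # Assumes a higher stage number means a more advanced stage.
    return (row["stage"], datetime.strptime(row["timestamp"], "%Y-%m-%d %H:%M:%S"))

def flag_main(rows):
    # Keep, for each customer, the row with the largest (stage, timestamp) key.
    best = {}
    for row in rows:
        current = best.get(row["customer_id"])
        if current is None or sort_key(row) > sort_key(current):
            best[row["customer_id"]] = row
    # Return copies of all rows with main=True only on the chosen row.
    return [dict(row, main=(best[row["customer_id"]] is row)) for row in rows]

flagged = flag_main(rows)
main_ids = sorted(r["record_id"] for r in flagged if r["main"])
print(main_ids)  # ['R2', 'R4', 'R5']
```

In this made-up data, R2 wins for C1 because of its higher stage, R4 wins for C2 because the stages tie and it is more recent, and R5 is Main because it is the customer's only record.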
Here is a sample of the dataset to clean up.
Your help would be greatly appreciated.
Many thanks
Esme
Hi @Christina_H,
Thank you for your help.
I may not have specified my case clearly, but there must be only one Main=true per Record and per Customer.
All the other Records linked to a CustomerID should be set to false.
The record that should be set as Main is the one with the highest Stage#.
In the output of your flow, I can see a record appearing twice, with both rows set as Main (e.g. AdRWYAA3),
and some other records that are all set to false (e.g. GEpHIAA1).
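For what it's worth, both problems can be checked for with a short script once the output is exported. This is only a sketch in plain Python: the field names (`customer_id`, `record_id`, `main`) and the rows are invented for illustration, and it assumes the intended rule is exactly one Main=true per customer.

```python
from collections import Counter

# Hypothetical output rows, invented to mimic the two problems described above.
output_rows = [
    {"customer_id": "C1", "record_id": "R1", "main": True},
    {"customer_id": "C1", "record_id": "R1", "main": True},   # same record twice, both Main
    {"customer_id": "C2", "record_id": "R2", "main": False},
    {"customer_id": "C2", "record_id": "R3", "main": False},  # customer with no Main at all
    {"customer_id": "C3", "record_id": "R4", "main": True},
    {"customer_id": "C3", "record_id": "R5", "main": False},  # valid customer
]

def check_output(rows):
    # Record IDs that appear on more than one row.
    record_counts = Counter(r["record_id"] for r in rows)
    duplicated_records = sorted(rid for rid, n in record_counts.items() if n > 1)

    # Number of Main=true rows per customer; anything other than 1 is a problem.
    main_counts = Counter()
    for r in rows:
        main_counts[r["customer_id"]] += int(r["main"])
    bad_customers = {c: n for c, n in sorted(main_counts.items()) if n != 1}
    return duplicated_records, bad_customers

duplicated_records, bad_customers = check_output(output_rows)
print(duplicated_records)  # ['R1']
print(bad_customers)       # {'C1': 2, 'C2': 0}
```

An empty list and an empty dictionary would mean there are no duplicated records and every customer has exactly one Main.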
Thank you for your help,
Esme
@Esmeralda_In_Paris OK, I've tweaked the sample and join tools so that there is one main per customer and record. If a customer only has one record ID there should only be one main for them. If they have two record IDs there will be one main for each. Is that what you're after?