I have created an app that gets data from two sources: the live IBM database and a flat file on a shared drive. The database doesn't contain historical data, so we store that in the flat file for the purpose of this process.
The user is prompted to select a period/month for which to pull the data. This model works fine as long as the database and the flat file don't have a period in common.
However, if the user selects a period (let's say the current month) that is present in both data sources, then my workflow contains duplicate values. In such a case, the workflow should automatically pick the data from the database source and ignore any data from the flat file. Any suggestions on how to achieve that?
Hi @Akash2093
I would suggest joining the data on the key fields, with the Live data on the left and the Flat data on the right. In the configuration of the Join tool, select only the fields from the Left input. After the join, add a Union that connects the J and the R outputs of the join. Any records in both the Live source and the flat file will be pushed out of the J output, but since you're including only the Live fields, the join effectively filters out the matching records from the Flat data and leaves the unmatched Flat records on the R output.
Dan
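For readers who want to see the join-then-union idea outside Alteryx, here is a minimal Python sketch. The function name `prefer_live`, the record layout, and the key fields are all hypothetical, chosen only for illustration. It also assumes that live records with no match in the flat file should be kept, which in Alteryx terms would mean connecting the L output to the Union as well.

```python
from typing import Dict, List, Tuple

Record = Dict[str, object]


def prefer_live(live: List[Record], flat: List[Record],
                key_fields: Tuple[str, ...]) -> List[Record]:
    """Keep every live record, plus flat records whose key has no live match."""

    def key_of(rec: Record) -> tuple:
        # Build a hashable key from the chosen key fields.
        return tuple(rec[f] for f in key_fields)

    # Keys present in the live source (hypothetical key fields).
    live_keys = {key_of(r) for r in live}

    # Flat records that match a live key are dropped (the live copy wins);
    # unmatched flat records correspond to the R output of the join.
    flat_only = [r for r in flat if key_of(r) not in live_keys]

    # Union: live records first, then the unmatched flat records.
    return list(live) + flat_only


# Illustrative data only: month 6 / id 1 exists in both sources.
live_rows = [{"month": 6, "id": 1, "source": "live"}]
flat_rows = [
    {"month": 5, "id": 1, "source": "flat"},
    {"month": 6, "id": 1, "source": "flat"},
]
combined = prefer_live(live_rows, flat_rows, ("month", "id"))
```

With this sample data, `combined` holds the live month-6 record and the flat month-5 record; the flat month-6 record is discarded because its key matches a live record.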
Hi @Akash2093
I have attached below a workflow that compares the datetime values in the Live feed and the Flat file. If the datetime information overlaps, it uses the Live feed data; if not, it uses both. You will then have the live data, the overlapping live data, and the old data combined.
Please mark the topic as concluded with an answer to the post.
Pedro.
Hi Pedrodrfaria,
We need to specify a condition, which I am not sure about. But there is another way: add the source file name in the Input tool and delete the duplicates manually later on.
Thanks for your replies. However, these responses rely on comparing data between the two sources, which won't work in my case. The data in the two sources may be the same or totally different; the only point is that if data (whether duplicate or distinct) is coming from both streams, then the workflow should switch to the live stream.
Hi @Akash2093
Exactly how do you determine if the data is coming from both streams? There must be a set of key fields that you use to compare the records. Otherwise, how can you tell if the data is in both streams?
Can you describe, preferably with an example, how you would compare the records from the two sources to determine if data is in one or both streams?
Dan
@danilang Fortunately, I was able to figure out something without looking for the key fields. I am attaching the solution app here.
In both inputs, month 6 is common. So the solution picks the user-selected month from either source, depending on availability, except when the user selects 6, in which case it picks data from the live source only.
Thanks for all the answers; they helped me get the strings.
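The attached app is not reproduced in this thread, so the following is only a rough Python sketch of the period-level switch described above, not the actual workflow. The function name `select_period`, the `month` field, and the sample rows are hypothetical. The idea is that no key fields are compared: if the live source has any rows for the selected month, only live rows are returned; otherwise the flat file is used.

```python
from typing import Dict, List

Record = Dict[str, object]


def select_period(live: List[Record], flat: List[Record], month: int,
                  period_field: str = "month") -> List[Record]:
    """Return rows for the selected month, switching to live when available."""
    # Rows for the selected month in the live source.
    live_rows = [r for r in live if r[period_field] == month]
    if live_rows:
        # The period exists in the live source: ignore the flat file entirely.
        return live_rows
    # Otherwise fall back to the flat file for that period.
    return [r for r in flat if r[period_field] == month]


# Illustrative data only: month 6 is present in both sources.
live_data = [
    {"month": 6, "value": 10, "source": "live"},
    {"month": 7, "value": 20, "source": "live"},
]
flat_data = [
    {"month": 5, "value": 1, "source": "flat"},
    {"month": 6, "value": 2, "source": "flat"},
]

june = select_period(live_data, flat_data, 6)  # live rows only
may = select_period(live_data, flat_data, 5)   # flat rows only
```

Because the switch is made per period rather than per record, it does not matter whether the overlapping rows in the two sources are identical or completely different.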