I have two input data documents will over 10,000 lines each of which is generated by a separate program and am trying to join the two by reference numbers. However there are slight differences in the reference numbers between the two systems due to how they are input. For example Program A might have FGRT102 and Program B might have FGRT102$ with the only difference being the $ at the end. To add to the difficulty, some reference numbers are 9 digits while others might be only 5 or up to 12. Is there any way to join these?
would it make sense to try to just match the first 5 or 6 characters from each?
Also if Program A is always included in Program B there are some ways you can do a wildcard non-join match... You can have both in a macro where he Program A is the macro Input and the B is a static input. It can then search if value A+wildcard is is Program B Alias....
Unfortunately due to the varying length and the fact that there may be 10 or 15 lines all of which have the same first few digits that is not an option here.
Is that something you could show me? I am pretty new to this and would have no idea how to set that up.
Is that something that you could show me an example of?
Post some sample data and an example of the output you would want. You also kind of need an arbitrary number of characters where you deem the ID the same... The way how your sample data is it's impossible for me to know if there is some other entry with the first 7 or 8 or 10 characters which are identical. One datasource has to have a base id- no?
See below for the sample data, I need a way to join the two data sets based on the alias, I could assume that the aliases in Program A would be correct so I need to see if there is a way to do that. The issue is that the data below is representative of the string lengths I work with and I cannot trim them down to a set number of characters as some items like the AHKHKI are sequential and trimming down the characters would eliminate those.
Program A Alias | Program B Alias |
TTTS-345623 | TTTS-345623$ |
NUSCGS005431 | NUSCGS005431EXT |
P-234571 | P-234571A |
AHKHKI115694 | AHKHKI115694ZZ |
CPCS-032480 | CPCS-032480! |
1000054274 | 1000054274X |
No, generally not. There is typically only one trailing.
User | Count |
---|---|
107 | |
82 | |
69 | |
54 | |
40 |