Hi everyone, I need help with the scenario below.
Scenario 1. The input is a detailed text string that contains a specific ID. The ID is a numeric value between 7 and 15 digits long, and we are using regex tokenize to parse it out.
Scenario 2. In some cases the field contains only 6 digits. These need to be reviewed separately; however, we still need numbers only.
At times the field can contain alphanumeric characters, for example A123456. If I use tokenize with 6-15 digits, it picks up 123456, which it should not.
If the value is alphanumeric, it should not be parsed; if it is purely numeric, with 6-15 digits together, it should be parsed.
@olimpio can you use the Data Cleansing tool to remove letters, punctuation, etc., i.e. keep only the numbers?
@olimpio do you have sample data to test on so we can try to put something together?
Without more examples it's difficult to figure out what the strings look like. Try the following and see if it works.
(\b\d{7,15}\b|\b\d{6}\b)
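If you want to check the pattern outside Alteryx first, here is a minimal Python sketch. The test string and the names PATTERN and split_by_length are made up for illustration, and it assumes Python's re module handles \b the same way the RegEx tool does for this kind of input, which I have not confirmed in Alteryx itself.

```python
import re

# Pattern from the post above: a standalone run of 7-15 digits, or exactly 6 digits.
# \b on both sides means the digits cannot touch a letter, another digit, or "_".
PATTERN = re.compile(r"(\b\d{7,15}\b|\b\d{6}\b)")

def split_by_length(text):
    """Return (scenario_1_ids, scenario_2_ids) found in text."""
    ids = PATTERN.findall(text)
    long_ids = [i for i in ids if len(i) >= 7]    # scenario 1: 7-15 digits
    short_ids = [i for i in ids if len(i) == 6]   # scenario 2: exactly 6 digits
    return long_ids, short_ids

# Hypothetical test string, not real data.
sample = "ref A123456 then 123456 and 1234567"
print(split_by_length(sample))  # (['1234567'], ['123456'])
```

The A123456 token is skipped because there is no word boundary between the letter and the first digit. Splitting by length afterwards keeps the 6-digit values separate so they can be reviewed on their own.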
Please share some sample input and output data.
That said, based on your requirement, I think the Data Cleansing tool would be an easy solution for this.
Input could be something like this.
scenario 1
1. working case abc, changes made to details of xyz submiited 1234567 [regex tokenize works here fine]
2. working case abc, changes made to details of pqr submiited 1234567 [regex tokenize works here fine]
scenario 2 - the last block of numbers needs to be parsed out
2. working case abc, changes made to details of Abc submiited 123456
3.working case abc, changes made to details of EFG123456 submiited 123456
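To make the expected result on these rows concrete, here is a rough Python sketch that takes the last standalone block of 6-15 digits from each line. ID_PATTERN and last_id are hypothetical names, and the rows are copied from the samples above without the bracketed notes; this is an illustration of the rule, not the Alteryx workflow itself.

```python
import re

# Standalone run of 6-15 digits (not touching letters, digits, or underscores).
ID_PATTERN = re.compile(r"\b\d{6,15}\b")

def last_id(text):
    """Return the last standalone 6-15 digit block in text, or None if there is none."""
    matches = ID_PATTERN.findall(text)
    return matches[-1] if matches else None

rows = [
    "1. working case abc, changes made to details of xyz submiited 1234567",
    "2. working case abc, changes made to details of Abc submiited 123456",
    "3.working case abc, changes made to details of EFG123456 submiited 123456",
]
for row in rows:
    print(last_id(row))
```

In the third row the digits inside EFG123456 are ignored because they are attached to letters, so only the trailing 123456 is returned. The leading row numbers are too short to match.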