RegEx (again)
This time I want to extract Company Name, City, and State. My current solution uses a RegEx tool set to parse into three output fields, using the following expression:
(^.*?)\,.*?([[:upper:]].*?)\,.*?([[:upper:]].*?)\,.*?
This expression depends heavily on every entry sticking to the same pattern, with commas separating the desired fields.
Company Name, Company City, Company State, ipsum dolor sit amet, awarded a $1,999,999,999 consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor awarded a not to exceed $3,555,678 in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
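For anyone who wants to try the expression outside the RegEx tool, here is a rough Python sketch of the same idea run against a shortened copy of the entry above. Two things in it are assumptions rather than part of my workflow: Python's re module does not support the POSIX class [[:upper:]], so [A-Z] stands in for it, and extract_fields is just a hypothetical helper name.

```python
import re

# Assumption: [A-Z] replaces [[:upper:]], which Python's re module does not support.
# The escaped commas (\,) are written as plain commas; the meaning is the same.
PATTERN = re.compile(r"(^.*?),.*?([A-Z].*?),.*?([A-Z].*?),.*?")


def extract_fields(entry):
    """Return (company, city, state), or None when the pattern does not match."""
    match = PATTERN.search(entry)
    return match.groups() if match else None


# Shortened copy of the well-formed sample entry.
entry = (
    "Company Name, Company City, Company State, ipsum dolor sit amet, "
    "awarded a $1,999,999,999 consectetur adipiscing elit"
)

print(extract_fields(entry))
# ('Company Name', 'Company City', 'Company State')
```

Each group is lazy, so it stops at the first comma it can reach, which is why the commas inside the dollar amount further along never come into play.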
It works until the source data does not follow its own pattern.
Here is an example of a problem entry. There are two issues. First, the name of a business unit was inserted between the company name and the city. Second, the comma after the state name is missing, so the expression either does not know where to stop or captures only the "New" in "New York". What can I do?
Company Name, Business Unit, Company City, Company State ipsum dolor sit amet, awarded a $1,999,999,999 consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor awarded a not to exceed $3,555,678 in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
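To show what goes wrong with the placeholder text above, here is a small Python sketch that runs the same kind of expression against a shortened copy of the problem entry. As an assumption, [A-Z] stands in for [[:upper:]] because Python's re module has no POSIX classes, and the variable names are only illustrative.

```python
import re

# Assumption: [A-Z] replaces [[:upper:]], which Python's re module does not support.
pattern = re.compile(r"(^.*?),.*?([A-Z].*?),.*?([A-Z].*?),.*?")

# Shortened copy of the problem entry: extra business unit, no comma after the state.
problem_entry = (
    "Company Name, Business Unit, Company City, Company State ipsum dolor sit amet, "
    "awarded a $1,999,999,999 consectetur adipiscing elit"
)

match = pattern.search(problem_entry)
shifted = match.groups() if match else None

print(shifted)
# ('Company Name', 'Business Unit', 'Company City')
```

The expression still matches, but every field after the company name shifts by one position: the business unit lands in the city field, the city lands in the state field, and the state is never captured.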
Thanks,