I have a data input file with over 10000 records. The file contains a field called Company. Some records contains many variations for one of the companies in the data set which is called Washington LLP (i.e. Washington LLP Co. Wash LLP, Washington Company) . I will need to standardize all of these values to one consistent name: Washington LLP
If I use the Find Replace tool, I will need to create a reference file with all of the possible variations for the company name. This will be time consuming given the amount of variations. Can anyone recommend a more efficient way of searching for an replacing all of the values?
Thanks,
Jenifer
This will take some work but I would start by using the FILTER tool on the "Company" field values to generate a list of names that begin with WAS or WASH.
Then I would apply the UNIQUE tool to crate a list of the unique values of all potential candidates. That should give you a manageable list of values you can scan manually to trim out the values that clearly do not belong.
Then you can use this curated list with the FIND REPLACE tool.
As a precaution I would copy the original name values into a separate column before making these changes, just in case you need to trace back, as that original value might also be linked to other data sets.