Hello, I have a large dataset with records of users' industry fields. Since there was no standardized data entry, users could type their industry freely into a textbox, which makes it harder to track trends. What would be the easiest way to standardize this field while keeping most of the records? I have tried the Fuzzy Match, Make Group, and Find Replace tools, but the results have been inconsistent. Attached is a sample of the dataset I am working with. Thank you.
Hi @R9wpr
The easiest way I can see to do this is to create a mapping file using a Text Input tool, where you manually define which group each industry should fall into, thus standardising it. You can then use the Find Replace tool to 'map' the standardised column into your dataset with the following setup:
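As a rough illustration of the same mapping idea outside Alteryx (this is not the Find Replace configuration itself), here is a minimal Python sketch. The mapping entries, the `normalize` and `standardize` helpers, and the "Unmapped" fallback are all assumptions made up for the example; a real mapping would be built from the values that actually appear in your data.

```python
import re

# Hypothetical mapping table: cleaned free-text variant -> standardised industry.
# In practice this would be filled in by hand from the values seen in the data.
INDUSTRY_MAP = {
    "it": "Information Technology",
    "tech": "Information Technology",
    "information technology": "Information Technology",
    "healthcare": "Healthcare",
    "health care": "Healthcare",
    "finance": "Finance",
    "banking": "Finance",
}


def normalize(text):
    """Lowercase the text and collapse punctuation/whitespace into single spaces."""
    cleaned = re.sub(r"[^a-z0-9]+", " ", text.lower())
    return cleaned.strip()


def standardize(raw, mapping=INDUSTRY_MAP, default="Unmapped"):
    """Look up the cleaned value in the mapping; unknown values get a default label."""
    return mapping.get(normalize(raw), default)


if __name__ == "__main__":
    for entry in ["  Health-Care ", "TECH", "Banking", "farming"]:
        print(repr(entry), "->", standardize(entry))
```

Keeping unmatched values under a label such as "Unmapped" rather than dropping them means no records are lost, and the unmatched list shows you which variants still need adding to the mapping.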
Hi @R9wpr,
You can standardize the value of each industry with a number by using the Record ID tool.
This gives each industry a consistent, number-based naming convention. It also reduces the margin of error we would have had with free-text entries of industry names.
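To show what a number-per-industry scheme could look like outside Alteryx, here is a small Python sketch. The `assign_industry_ids` function is a hypothetical helper, not the Record ID tool itself, and it assumes the names passed in have already been cleaned, since two different spellings of the same industry would otherwise receive two different numbers.

```python
def assign_industry_ids(names):
    """Give each distinct industry name a numeric ID in order of first appearance."""
    ids = {}
    for name in names:
        if name not in ids:
            # IDs start at 1 and increase by 1 for each new name.
            ids[name] = len(ids) + 1
    return ids


if __name__ == "__main__":
    # Hypothetical, already-cleaned industry values.
    industries = ["Finance", "Healthcare", "Finance", "Information Technology"]
    lookup = assign_industry_ids(industries)
    # Replace each name with its ID.
    print([lookup[name] for name in industries])
```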
You can find a test using the file you provided.
Let us know if it works the way you want.
Cheers!