This site uses different types of cookies, including analytics and functional cookies (its own and from other sites). To change your cookie settings or find out more, click here. If you continue browsing our website, you accept these cookies.
This is what I have:
Action | Companies |
A | Company 1; Company 2 LLC |
B | Company 2 inc |
C | Company one |
D | Company 1 international; Company 3; C2 |
E | Company 3 |
This is the outcome I want
Companies | Number of actions |
Company 1 | 2 |
Company 2 | 2 |
Company 3 | 2 |
C2 | 1 |
Problem 1: Find unique values from the column "companies"
Problem 2: Identify names that were spelled incorrectly/ in different ways, but actually belong to the same company
Firstly, you want to use the text to columns tool to make sure there is a single company in each row. You need to use ; as the delimiter and use 'Split to rows' mode, this will bring your data to look something like...
Action | Companies |
A | Company 1 |
A | Company 2 LLC |
B | Company 2 inc |
C | Company one |
D | Company 1 international |
D | Company 3 |
D | C2 |
E | Company 3 |
To get the count you can then use the summerize tool, grouping by the companies field and also performing a count against the companies field.
Now as you mentioned, there are issues with your company details, to resolve this there is a couple of approaches you could take, one automated, one a bit more manual; personally, I would apply a manual method to suppress common extensions like 'inc', 'LLC' and 'international'. This can be done via a mapping table and the 'find and replace' tool.
Alternatively, a more automated approach, would be through using the fuzzy matching tool; a good starter blog for this tool is here: https://www.theinformationlab.co.uk/2014/04/02/alteryx-tools-focus-fuzzy-match-make-group-unique/
Ben
To identify the same company name spelled differently, check out the Fuzzy Match tool
After you've matched similar company names, use the Summarize tool to count the actions.