I am looking at a list of companies and ranking them based on # of employees.
However, these companies have several different lines of business in various industries and may operate both nationally and internationally.
Luckily, even if the companies are listed several times in several different ways, I can connect them by a unique ID.
This is my starting point:
Ranking | Unique ID | Company Name | # Employees | National/International | Industry |
1 | 001 | Company A | 500,000 | National | Healthcare |
2 | 003 | Company C | 450,000 | International | Financial Services |
3 | 002 | Company B | 400,000 | National | Real Estate |
4 | 004 | Company D | 350,000 | National | Healthcare |
5 | 006 | Company F | 300,000 | National | Retail |
6 | 001 | Company A | 250,000 | International | Pharmacy |
7 | 004 | Company D | 200,000 | National | Biotech |
8 | 002 | Company B | 150,000 | International | Financial Services |
9 | 003 | Company C | 100,000 | International | Banking |
10 | 001 | Company A | 50,000 | National | Retail |
I intend to rank them by their primary (biggest) businesses, but also show their secondary businesses (and possibly third, etc.) in different columns, but all on the same row.
I have tried using the Summarize tool to concatenate the company names and other aspects, and group by unique ID. Right now, the # of employees is summed, but I would like it separated out for each line of business.
This is what the data currently looks like:
Ranking | Unique ID | Company Name | # Employees | National/International | Industry |
1 | 001 | Company A, Company A, Company A | 800,000 | National, International, National | Healthcare, Pharmacy, Retail |
2 | 003 | Company C, Company C | 550,000 | International, International | Financial Services, Banking |
2 | 002 | Company B, Company B | 550,000 | National, International | Real Estate, Financial Services |
2 | 004 | Company D, Company D | 550,000 | National, National | Healthcare, Biotech |
3 | 006 | Company F | 300,000 | National | Retail |
This is my desired end result:
Ranking | Unique ID | Company Name (Primary Business) | # Employees | National/International | Industry | Company Name (Secondary Business) | # Employees (Secondary Business) | National/International (Secondary Business) | Industry (Secondary Business) | Company Name (Tertiary Business) | # Employees (Tertiary Business) | National/International (Tertiary Business) | Industry (Tertiary Business) |
1 | 001 | Company A | 500,000 | National | Healthcare | Company A | 250,000 | International | Pharmacy | Company A | 50,000 | National | Retail |
2 | 003 | Company C | 450,000 | International | Financial Services | Company C | 100,000 | International | Banking | Null | Null | Null | Null |
3 | 002 | Company B | 400,000 | National | Real Estate | Company B | 150,000 | International | Financial Services | Null | Null | Null | Null |
4 | 004 | Company D | 350,000 | National | Healthcare | Company D | 200,000 | National | Biotech | Null | Null | Null | Null |
5 | 006 | Company F | 300,000 | National | Retail | Null | Null | Null | Null | Null | Null | Null | Null |
Please let me know the most efficient way to get here. Right now I am thinking of converting the # Employees column to a string (currently a double), concatenating it, then using Text to Columns to separate it out. However, I would need to do several columns at once.
Any/all advice is appreciated. Thank you!
@Shaskel one way of doing this
tile tool unique value - unique id. use formala tool to rename
[Name] as
if [Tile_Sequence_Number]>1 then [Name] + tostring([Tile_Sequence_Number]) else [Name] endif - feel free to add another formula tool with a switch command to switch 2 to secondary, 3 to third, etc. - whatever - that's on you.
filter tool -> get ride of the sequence numbers beyond the part you want.
now cross tab - primary keys are unique id/company name - name is your cross tab column - value is your value. concatenate or sum your values.
User | Count |
---|---|
16 | |
14 | |
11 | |
6 | |
6 |