Hi everyone,
I have a dataset which contains 50 columns of data and 1M+ rows. I need to add field descriptions into the dataset, but i can't think of an efficient way of doing it.
Here is some example data. My data itself looks a bit like this:
NATIONALITY | PRODUCT | REGION | SATISFACTION |
TH | 01 | NE | 01 |
MO | 02 | N | 03 |
IT | 04 | E | 05 |
TH | 05 | C | 02 |
CH | 02 | W | 02 |
MO | 03 | S | 02 |
FR | 03 | NW | 05 |
GB | 05 | O | 04 |
IT | 04 | SW | 01 |
GB | 06 | SE | 03 |
FR | 06 | UNK | 04 |
CH | 01 | NE | 04 |
And my label lookup file looks like this:
FIELD_NAME | DATA | LABEL |
NATIONALITY | GB | GREAT BRITAIN |
NATIONALITY | MO | MOROCCO |
NATIONALITY | IT | ITALY |
NATIONALITY | FR | FRANCE |
NATIONALITY | TH | THAILAND |
NATIONALITY | CH | CHINA |
PRODUCT | 01 | BAR-Z |
PRODUCT | 02 | BAR-Y |
PRODUCT | 03 | LEVER-X |
PRODUCT | 04 | LEVER-A |
PRODUCT | 05 | LEVER-C |
PRODUCT | 06 | LEVER-G |
REGION | N | NORTH |
REGION | NW | NORTH WEST |
REGION | NE | NORTH EAST |
REGION | C | CENTRAL |
REGION | E | EAST |
REGION | W | WEST |
REGION | S | SOUTH |
REGION | SE | SOUTH EAST |
REGION | SW | SOUTH WEST |
REGION | O | OFFSHORE |
REGION | UNK | UNKNOWN |
SATISFACTION | 01 | VERY LOW |
SATISFACTION | 02 | LOW |
SATISFACTION | 03 | MEDIUM |
SATISFACTION | 04 | HIGH |
SATISFACTION | 05 | VERY HIGH |
And i'd like to end up with something that looks like this, the original dataset with label fields added in:
NATIONALITY | NATIONALITY_LABEL | PRODUCT | PRODUCT_LABEL | REGION | REGION_LABEL | SATISFACTION | SATISFACTION_LABEL |
TH | THAILAND | 01 | BAR-Z | NE | NORTH EAST | 01 | VERY LOW |
MO | MOROCCO | 02 | BAR-Y | N | NORTH | 03 | MEDIUM |
IT | ITALY | 04 | LEVER-A | E | EAST | 05 | VERY HIGH |
TH | THAILAND | 05 | LEVER-C | C | CENTRAL | 02 | LOW |
CH | CHINA | 02 | BAR-Y | W | WEST | 02 | LOW |
MO | MOROCCO | 03 | LEVER-X | S | SOUTH | 02 | LOW |
FR | FRANCE | 03 | LEVER-X | NW | NORTH WEST | 05 | VERY HIGH |
GB | GREAT BRITAIN | 05 | LEVER-C | O | OFFSHORE | 04 | HIGH |
IT | ITALY | 04 | LEVER-A | SW | SOUTH WEST | 01 | VERY LOW |
GB | GREAT BRITAIN | 06 | LEVER-G | SE | SOUTH EAST | 03 | MEDIUM |
FR | FRANCE | 06 | LEVER-G | UNK | UNKNOWN | 04 | HIGH |
CH | CHINA | 01 | BAR-Z | NE | NORTH EAST | 04 | HIGH |
Some of the data has a different label in different fields, e.g. "01" gets a different label in the "product" and "satisfaction" fields. The best solution i can currently think of is to create a separate label lookup file for each field name and then use joins to bring in the labels for each field.
I'm hoping that someone can provide a more efficient solution, and i'd be really grateful to anyone who can!
Thank you!
Matt
Solved! Go to Solution.
Hello @ms_315,
How about this:
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
Regards
Thank you @afv2688 for the solution!