Want to get involved? We're always looking for ideas and content for Weekly Challenges.
SUBMIT YOUR IDEA
A solution to last week’s challenge can be found here.
To solve this week’s challenge, use Designer Desktop or Designer Cloud.
This challenge comes to us from @randall_king
Haven’t heard about Designer Cloud yet? Watch a demo.
https://www.alteryx.com/products/designer-cloud-trifacta
Do you often find that you have to clean data to get it in a readable and usable format?
In this weekly challenge, you need to clean data and organize it into a format that is more suitable for analysis.
The dataset contains information about the Olympic Games:
1. Participating Country
2. Country Code
3. Total Summer Games
4. Total Gold Medals Summer
5. Total Silver Medals Summer
6. Total Bronze Medals Summer
7. Total Medals Combined Summer
8. Total Winter Games
9. Total Gold Medals Winter
10. Total Silver Medals Winter
11. Total Bronze Medals Winter
12. Total Medals Combined Winter
13. Total Combined Games
14. Total Gold Medals Combined Games
15. Total Silver Medals Combined Games
16. Total Bronze Medals Combined Games
17. Total All Medals Combined Games
However, the data is all compiled in one field.
Your goal is to:
- Parse the data into columns.
- Clean out the special characters for the Country Code (for example, update “(AFG)” to “AFG”).
- Name the new columns (as shown above).
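For anyone who wants to prototype the logic outside Designer, the three goals above can be sketched in Python with a single regular expression. This is only an illustration, not the official solution: the row layout (country name, code in parentheses, then 15 whitespace-separated integers) is an assumption about the input file, the sample row is made up for the example, and `parse_row` is a hypothetical helper name.

```python
import re

# Column names taken from the list in the challenge description.
COLUMNS = [
    "Participating Country",
    "Country Code",
    "Total Summer Games",
    "Total Gold Medals Summer",
    "Total Silver Medals Summer",
    "Total Bronze Medals Summer",
    "Total Medals Combined Summer",
    "Total Winter Games",
    "Total Gold Medals Winter",
    "Total Silver Medals Winter",
    "Total Bronze Medals Winter",
    "Total Medals Combined Winter",
    "Total Combined Games",
    "Total Gold Medals Combined Games",
    "Total Silver Medals Combined Games",
    "Total Bronze Medals Combined Games",
    "Total All Medals Combined Games",
]

# ASSUMED layout: "<country> (<CODE>) n1 n2 ... n15".
# The parentheses sit outside the "code" group, so "(AFG)" becomes "AFG".
ROW_PATTERN = re.compile(
    r"^\s*(?P<country>.+?)\s*\((?P<code>[A-Z]{3})\)\s+"
    r"(?P<numbers>\d+(?:\s+\d+){14})\s*$"
)


def parse_row(line):
    """Split one single-field row into a dict keyed by column name.

    Returns None when the line does not match the assumed layout.
    """
    match = ROW_PATTERN.match(line)
    if match is None:
        return None
    values = [match.group("country"), match.group("code")]
    values += [int(n) for n in match.group("numbers").split()]
    return dict(zip(COLUMNS, values))


# Illustrative row only; the real file may be delimited differently.
sample = "Afghanistan (AFG) 13 0 0 2 2 0 0 0 0 0 13 0 0 2 2"
row = parse_row(sample)
print(row["Participating Country"], row["Country Code"])
```

If the real file uses thousands separators or a different delimiter, the pattern would need adjusting. In Designer, the same idea maps roughly onto a RegEx tool in Parse mode followed by a Select tool to rename the fields.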
Data Source: www.kaggle.com/datasets/rushikeshlavate/olympic-games-medal-datasetfrom-1896-to-2018
Some nice RegEx to solve this 😊
Here's my approach