Want to get involved? We're always looking for ideas and content for Weekly Challenges.
SUBMIT YOUR IDEAHere's my solution. Slightly different results since I did more data cleansing - the provided solution didn't standardise the case. There's probably more that could be done to increase the accuracy too.
Solved using:
1. Data Cleansing
2. RegEx
3. Filter
4. Summary
5. Short
6. Append Fields
7. Formula
SOLVED - Extending Field Length from 254 to 2540000 -> Data Cleanse of Leading and Trailing Whitespace -> RegEx to Parse Field_1 and convert \<w+\> with a Tokenized Output splitting into rows -> Filter Field_1 is not null -> Summarize: Field_1 Group By Field_1 and Field_1 Count Count (Count Descending) and then also Summarize: Count Sum Sum_Count -> APPEND both Summarize Output > FORMULA: Percentage = ([Count]/[Sum_Count])*100
Also, explored the FREQUENCY TABLE function where it automatically creates columns for Frequency, Percent, Cumulative Frequency and Cumulative Percent
well, well, well..... RegEx and me, still not a very close relationship to be honest.
I studied @RolandSchubert's solution to get ideas and inspiration on how to best tackle such a task.
As always I learned something new - that was the goal of the challenge, right?