This site uses different types of cookies, including analytics and functional cookies (its own and from other sites). To change your cookie settings or find out more, click here. If you continue browsing our website, you accept these cookies.
We've recently made an accessibility improvement to the community and therefore posts without any content are no longer allowed. Please use the spoiler feature or add a short message in the message body in order to submit your weekly challenge.
1) Lowercase and ReplaceCharacter formula: lowercase(replacechar([lyric],'",.:;?!()',"")): I used Data Cleansing tool at first to remove leading whitespace, punctuation and modify upper case but did not work exactly as expected. I assume the discrepancy is from single quote punctuation getting removed by Data Cleansing tool but not by formula. 2) Text to Column: learned using Delimiter \s to represent space and split to rows.I used Reg Ex to tokenize entire words to rows but as a result some abbreviated words containing single quotes, such as "that's", ended up splitting into two words, "that" and "s". So when counting the top words the result included single letter words like "s", "ll", etc. 3) Summarize: count distinct -useful when counting unique values. The Unique tool returned a different result. 4) Join: left: lyric and right stop_words: this is so smart because the left output shows all lyrics without stop words which is what we want and can proceed to next steps, joined output shows lyrics matching stop words and right output shows stop words "leftovers" that don't match any lyric so these are the data we don't need. 5) Again: Sort -- two sorting orders and using dictionary order; Sample -- first n rows and grouping by column; Summarize -- concatenate string. add space in concat_string -- replacechar([Concat_lyric],","," ").