Cloud Quest #27: Word Sleuth
Hi Community,
We posted the solution JSON file to Cloud Quest #26. Check it out and let us know what you think! Send suggestions to academy@alteryx.com or leave a comment below!
Let’s dive into this week's quest!
- Download and extract the provided ZIP file containing your starting data and workflow files.
- Upload the provided Cloud Quest 27 Start.json file into your Analytics Cloud library.
- Reconnect the provided Cloud Quest 27 Input.csv and Cloud Quest 27 Output.csv datasets to your starting workflow file.
For more detailed instructions on how to import and export Designer Cloud workflow files, check out the pinned article Cloud Quest Submission Process Update.
Scenario:
This week's Cloud Quest was inspired by a submission from Abubakar Mahmood (@BuQu). Thank you for the contribution!
The starting dataset for this quest is a TXT file that contains the transcript of The Project Gutenberg eBook of The Adventures of Sherlock Holmes by Sir Arthur Conan Doyle. TXT files can be easily converted into CSV format for use in Designer Cloud.
Using the provided input, count the number of times each word appears in the text. Then, sort the words in descending order of count and calculate each word's share of the total word count as a percentage.
Hint: You might notice a different number of output rows in your solution, as it largely depends on how specifically you define a valid word. In our solution, we excluded numbers, empty or null cells, and single-character cells (except for "a" and "I"). If you're unsure where to begin, start by tokenizing words into individual rows.
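For anyone who wants to sanity-check their row counts outside Designer Cloud, here is a minimal Python sketch of the same steps: tokenize, filter, count, sort, and compute percentages. The function names (`is_valid_word`, `word_frequencies`), the regex tokenizer, the lowercasing, and the use of the filtered word total as the percentage denominator are all assumptions made for illustration. None of them come from the official solution, so the counts may not match the provided output exactly.

```python
import re
from collections import Counter

# Single-character tokens kept by the hint: "a" and "I" (compared case-insensitively here).
ALLOWED_SINGLE = {"a", "i"}


def is_valid_word(token):
    """Apply the hint's rules: no empties/nulls, no numbers, no single characters except a/I."""
    if not token:                      # None or empty string
        return False
    if token.isdigit():                # pure numbers
        return False
    if len(token) == 1 and token.lower() not in ALLOWED_SINGLE:
        return False
    return True


def word_frequencies(text):
    """Return (word, count, percent) tuples sorted by count descending, then word ascending."""
    # Assumption: a token is a run of letters, digits, or apostrophes.
    tokens = re.findall(r"[A-Za-z0-9']+", text)
    # Assumption: counting is case-insensitive, so "The" and "the" are the same word.
    words = [t.lower() for t in tokens if is_valid_word(t)]
    counts = Counter(words)
    total = sum(counts.values())
    rows = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(word, n, 100.0 * n / total) for word, n in rows]


sample = "I saw a dog. The dog saw 2 cats, x marks the spot."
for word, n, pct in word_frequencies(sample):
    print(f"{word:<6} {n:>2} {pct:6.2f}%")
```

Changing the tokenizer pattern or the case handling is the quickest way to see how much the output row count depends on the definition of a valid word.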
If you find yourself struggling with any of the tasks, feel free to explore these interactive lessons in Alteryx Academy for guidance:
- Getting Started with Designer Cloud
- Building Connections in Designer Cloud
- Building Your Workflow in Designer Cloud
Once you have completed your quest, go back to your Analytics Cloud library.
- Download your workflow solution file.
- Include your JSON file and a screenshot of your workflow as attachments to your comment.
Here’s to a successful quest!
Download Start File | Download Solution File
Solved!
I had trouble getting my solution to match the provided solution. Part of the difference is because the solution counts empty cells as values, whereas I excluded them. But even after adjusting for that, I'm still off a bit. I know it's going to come down to my cleansing/parsing being different ... or just a mistake :D
I got 1,094,890 rows with nulls and empties removed.
With only nulls removed, I got 1,164,765.
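The gap between those two counts (1,164,765 - 1,094,890 = 69,875 rows) is what the empty cells would account for. As a rough illustration of how the two rules diverge, here is a small Python sketch. The `tokens` list and the `count_rows` helper are hypothetical, with `None` standing in for a null cell and `""` for an empty cell; this is not how Designer Cloud represents them internally.

```python
# Hypothetical token column after splitting text to rows:
# None stands in for a null cell, "" for an empty cell.
tokens = ["the", "", "adventures", None, "of", "", "sherlock", "holmes", None]


def count_rows(tokens, drop_empty):
    """Count rows after removing nulls, and optionally empty strings as well."""
    kept = [t for t in tokens if t is not None and (t != "" or not drop_empty)]
    return len(kept)


nulls_only = count_rows(tokens, drop_empty=False)
nulls_and_empties = count_rows(tokens, drop_empty=True)
print(nulls_only, nulls_and_empties, nulls_only - nulls_and_empties)
```

The difference between the two results is exactly the number of empty cells, which is why the choice of rule shifts the row count before any word-level cleansing is applied.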
@Carolyn @alexnajm – thank you both for highlighting the inconsistent output rows. You may find that your solution has a different number of output rows, as this depends on how strictly you define a valid word upstream. In the original solution, we applied minimal filtering of words before counting them, as the primary focus of the exercise was on parsing and calculating rather than dictionary-level accuracy for the words themselves.
To address this, we’ve updated the start file with a new Cloud Quest 27 Output.csv. This time, we’ve excluded numbers, empty or null cells, and single-character cells (with the exception of "a" and "I"). While this filtering is not exhaustive, it should align more closely with your workflow results.
Thanks again!
@AYXAcademy I corrected mine and got it working 😊
I am still a bit off from the provided answer, but I decided to leave it as is. 😁
Nice and quick challenge!
I wrote a blog post (in Dutch) on Zipf's law back in 2017, which helps filter out these filler words.
Matched exactly.
