We posted the solution JSON file to Cloud Quest #6. Check it out and let us know what you think! Send suggestions to academy@alteryx.com or leave a comment below.
For more detailed instructions on how to import and export Designer Cloud workflow files, check out the pinned article Cloud Quest Submission Process Update.
Marvel Comics has a long history dating back to the 1930s and has created hundreds of characters we still find in superhero stories today. For today’s quest, you will blend and parse data to identify every appearance of each character in Marvel Comics’ publication history (up to 2018).
Task 1: Use the provided datasets to determine the title, year, and issue number of every comic in which the characters in the Characters Text Input tool appear.
Task 2: Identify the first appearance of each character (year and issue number).
* Remove records without a publication year before sampling.
For those of you who are curious, titles without a published year are generally trade paperbacks and omnibuses that are collections of previously published comic issues. These are organized in narrative order rather than publication date and may include multiple titles contributing to the same storyline.
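Outside of Designer Cloud, the Task 2 logic (drop records with no publication year, then keep the earliest record per character) can be sketched in Python. This is only an illustration: the record layout, the character names, and the tie-break on issue number within a year are assumptions, not part of the quest data.

```python
# Hypothetical parsed records: (character, title, year, issue).
# year is None when the title has no publication year.
records = [
    ("Character A", "Example Title", 1975, 12),
    ("Character A", "Example Title", 1963, 4),
    ("Character A", "Example Collection", None, None),
    ("Character B", "Other Title", 1990, 1),
]

def first_appearances(rows):
    """Return {character: (title, year, issue)} for the earliest dated record."""
    # Filter step: remove records without a publication year.
    dated = [r for r in rows if r[2] is not None]
    # Sort by character, then year, then issue (the issue tie-break is an assumption).
    dated.sort(key=lambda r: (r[0], r[2], r[3]))
    first = {}
    for character, title, year, issue in dated:
        # Sample step: keep only the first record seen for each character.
        first.setdefault(character, (title, year, issue))
    return first

print(first_appearances(records))
```

Filtering before sampling matters here: if the undated collection were left in, a sort that puts missing years first would pick it as the "first appearance".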
Hint: Use the RegEx tool to parse comic titles and publication years. Use a Summarize tool to group parsed data by Character Name, Comic Title, Publication Year, and Issue Number.*
*If you notice records with an issue number of -1, this is a numbering convention Marvel sometimes uses to indicate a prequel story. This will not affect your result.
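As a rough sketch of the parsing hint, here is what the regular expression could look like in Python. It assumes titles follow a "Title (Year) #Issue" layout, which is a guess about the dataset rather than something stated in the quest, and the sample strings are made up. The pattern allows a leading minus sign so that -1 issue numbers are kept.

```python
import re

# Assumed layout (not confirmed by the quest data): "Title (Year) #Issue".
TITLE_PATTERN = re.compile(
    r"^(?P<title>.+?)\s*\((?P<year>\d{4})\)\s*#(?P<issue>-?\d+)$"
)

def parse_comic(raw):
    """Return (title, year, issue); year and issue are None if the pattern fails."""
    text = raw.strip()
    match = TITLE_PATTERN.match(text)
    if match is None:
        # For example, collections with no publication year in the title.
        return text, None, None
    return match["title"], int(match["year"]), int(match["issue"])

# Hypothetical sample values for illustration only.
samples = [
    "Example Title (1963) #1",
    "Example Title (1997) #-1",
    "Example Collection (Trade Paperback)",
]
for raw in samples:
    print(parse_comic(raw))
```

Records that come back with a year of None correspond to the titles without a publication year mentioned above.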
A combination of the RegEx, Join, Summarize, Formula, Filter, and Sample tools should solve your problem, but not necessarily in this sequence.
If you find yourself struggling with any of the tasks, feel free to explore these interactive lessons in the Maveryx Academy for guidance:
Once you have completed your quest, go back to your Analytics Cloud library.
Attached is my solution. I also noticed something unexpected while debugging. My original logic had a mistake, so I tried to compare my result to the provided solution. When I joined the two on the Character and Published year columns, the records with null values in the Published year column did not match each other. I posted a more detailed write-up here: Unable to join on null values
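For anyone hitting the same thing: this behavior is consistent with SQL-style null handling, where NULL = NULL is not true, so null keys never pair up in an equality join. Whether Designer Cloud's Join tool follows these rules internally is an assumption on my part. The sketch below only demonstrates the SQL semantics, using Python's built-in sqlite3 module with made-up table and column names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables standing in for "my result" and "the provided solution".
conn.executescript("""
    CREATE TABLE mine (character_name TEXT, published_year INTEGER);
    CREATE TABLE solution (character_name TEXT, published_year INTEGER);
    INSERT INTO mine VALUES ('Character A', 1963), ('Character A', NULL);
    INSERT INTO solution VALUES ('Character A', 1963), ('Character A', NULL);
""")

# Plain equality: NULL = NULL evaluates to NULL, so the null-year rows do not match.
equal_join = conn.execute("""
    SELECT COUNT(*) FROM mine m JOIN solution s
      ON m.character_name = s.character_name
     AND m.published_year = s.published_year
""").fetchone()[0]

# SQLite's IS operator behaves like = but treats two NULLs as equal.
null_safe_join = conn.execute("""
    SELECT COUNT(*) FROM mine m JOIN solution s
      ON m.character_name = s.character_name
     AND m.published_year IS s.published_year
""").fetchone()[0]

print(equal_join, null_safe_join)  # prints: 1 2
conn.close()
```

If the Join tool does behave this way, one possible workaround is to replace nulls with a placeholder value on both sides before joining, then convert the placeholder back afterward.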