After adjusting for the number of characters allowed per line item, I was finally able to find the very simple solution. I am 100% not surprised that the most used word was "the".
Managed to get the exact match with a little research.
Final output seems incorrect, as it contains punctuation and Roman numerals, and some contractions have been split at the apostrophe, e.g. "wasn" and "t" are counted as separate words.
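For anyone who wants to sanity-check the splitting outside Alteryx, here is a rough Python sketch of one way to tokenize so that punctuation is dropped and contractions stay whole. The pattern and the `tokenize` name are my own assumptions, not the challenge's official solution. It only handles straight apostrophes and ASCII letters, and it makes no attempt to filter Roman numerals, since something like "I" is ambiguous.

```python
import re

# Assumption: a "word" is a run of ASCII letters, optionally joined by
# straight apostrophes (so "wasn't" stays one token instead of "wasn" + "t").
WORD_RE = re.compile(r"[A-Za-z]+(?:'[A-Za-z]+)*")

def tokenize(text):
    """Return the words in text, skipping punctuation and digits."""
    return WORD_RE.findall(text)

print(tokenize("It wasn't there, was it?"))
# ['It', "wasn't", 'there', 'was', 'it']
```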
Close enough... it seems words are case sensitive, so "the" is counted separately from "The", and so forth. I also re-pulled the input data from the source text file to remove truncated lines.
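If the case sensitivity is what throws the counts off, lowercasing every word before counting merges "the" and "The" into one entry. A minimal Python sketch of that idea, assuming the words have already been split out (the `count_words` name is hypothetical, and whether the expected output wants case folded is a guess on my part):

```python
from collections import Counter

def count_words(words):
    """Count words case-insensitively by lowercasing each one first."""
    return Counter(word.lower() for word in words)

counts = count_words(["The", "the", "cat", "THE"])
print(counts.most_common(1))
# [('the', 3)]
```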