Depending on what the objective is here, I might suggest a different approach.
If it’s for learning, then sure, this can probably be achieved entirely within Alteryx.
If it’s for an actual project, I’d probably grab the stadium list from the wiki with a good old copy and paste, and then use one of the many geocoding apps available on the Alteryx Gallery to take the stadium names and return the appropriate latitude and longitude. If you search for "Google Maps" on the Alteryx Gallery, there is a wide array of options available.
Although you could do this without REGEX, it is your friend here!!
Tokenize (to rows) the DownloadData field on <table(.*?)</table>, then again (to rows) on <tr(.*?)</tr>, and then (to columns) on <td(.*?)</td>.
The second column holds the links, still wrapped in their <a> tags, so you will need to parse that column with <a href="(.*?)".*?>(.*?)</a> to get the link target and the link text.
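If it helps to see the same regex steps outside Alteryx, here is a rough Python sketch that applies the patterns above in the same order (table, then row, then cell, then link). The sample HTML and the function name extract_links are made up for illustration; the real DownloadData content will look different.

```python
import re

# Hypothetical HTML standing in for the DownloadData field.
html = """
<table class="wikitable">
<tr><th>No.</th><th>Stadium</th></tr>
<tr><td>1</td><td><a href="/wiki/Example_Stadium" title="Example Stadium">Example Stadium</a></td></tr>
<tr><td>2</td><td><a href="/wiki/Sample_Park">Sample Park</a></td></tr>
</table>
"""

# Same patterns as in the post; re.S lets "." span line breaks.
TABLE = re.compile(r"<table(.*?)</table>", re.S)
ROW = re.compile(r"<tr(.*?)</tr>", re.S)
CELL = re.compile(r"<td(.*?)</td>", re.S)
LINK = re.compile(r'<a href="(.*?)".*?>(.*?)</a>', re.S)


def extract_links(page):
    """Return (href, link text) pairs taken from the second cell of each row."""
    links = []
    for table in TABLE.findall(page):        # tokenize to rows: tables
        for row in ROW.findall(table):       # tokenize to rows: table rows
            cells = CELL.findall(row)        # tokenize to columns: cells
            if len(cells) < 2:
                continue                     # header rows have no <td> cells
            match = LINK.search(cells[1])    # parse the second column
            if match:
                links.append((match.group(1), match.group(2)))
    return links


print(extract_links(html))
```

Regex on HTML is fragile (nested tables or unusual attributes will trip it up), but for a single well-behaved list page it is usually good enough.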
Then use a Formula tool to construct the full URL from the parsed data, and feed that back into the Download tool.
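In the Formula tool this is probably just string concatenation of the site root and the parsed href. As a sketch of the same idea in Python, urljoin from the standard library resolves a relative link against the page it came from. The base URL and the build_urls helper are placeholders, not the real page.

```python
from urllib.parse import urljoin

# Hypothetical address of the list page that was downloaded first.
BASE_URL = "https://example.org/wiki/List_of_stadiums"


def build_urls(base, hrefs):
    """Resolve each parsed href against the page it was found on."""
    return [urljoin(base, href) for href in hrefs]


urls = build_urls(BASE_URL, ["/wiki/Example_Stadium", "/wiki/Sample_Park"])
print(urls)
```

Resolving against the base rather than gluing strings together also copes with links that are already absolute, which are returned unchanged.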
The image below gives an idea of the process. I've collapsed the container as it would otherwise be confusing. I just put together a quick set of tools to get the picture, so it is not necessarily a working workflow.
You now have the HTML of all the stadium pages. I haven't looked at those, but look for the coordinates; REGEX is the easiest way to pull them out, and they should be pretty standard across the pages.
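As a very rough illustration of that last step, the Python sketch below pulls a decimal latitude and longitude out of a page. It assumes the coordinates appear as "lat; lon" inside a span with class "geo"; that markup, the sample page, and the extract_coordinates name are all assumptions, so check the actual HTML and adjust the pattern before relying on it.

```python
import re

# Hypothetical fragment of a stadium page; the real markup may differ.
sample_page = '<p>Location: <span class="geo">51.5560; -0.2796</span></p>'

# Assumed layout: decimal latitude, a semicolon, then decimal longitude.
GEO = re.compile(
    r'<span class="geo">\s*(-?\d+(?:\.\d+)?)\s*;\s*(-?\d+(?:\.\d+)?)\s*</span>'
)


def extract_coordinates(page):
    """Return (latitude, longitude) as floats, or None if nothing matches."""
    match = GEO.search(page)
    if match is None:
        return None
    return float(match.group(1)), float(match.group(2))


print(extract_coordinates(sample_page))
```

Returning None for pages without a match makes it easy to filter out the stadiums that need a manual look.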