Sometimes using the text-to-columns or the formula is not enough. It's high time for a skills upgrade. With a working knowledge of regular expressions, you'll soon tame the most wildly cluttered data.
You’ll Learn How to:
Parse Data
Featured Tools:
RegEx
Presenter: @AlexKo
Date: 2017-03-27
Alteryx Designer Version: 11.0
Thanks for noting that error. I updated yesterday morning so that the date should now be correct.
Great Session, really demystified RegEx.
Using the URL in the demo, http://www.energy.ca.gov/almanac/renewables_data/solar/, how could you create a workflow to pull all years in the year selection drop down (or perhaps 2010 - 2015). Is that possible?
Thanks.
Hi @Inactive User,
Cool question. Yeah you can do that. This website uses PHP for its interactive elements. If you use that year selection menu, you might notice that the URL changes to...
http://www.energy.ca.gov/almanac/renewables_data/solar/index.php
But it doesn't change the URL based on the year, it just updates the tables on the webpage. So what's going on behind the scenes? You can click anywhere on the page and "View page source" to find out. Here's the part that's important.
<form style="margin-left:23px;" action="index.php" method="post"> <select id="goYear" name="newYear"> <option value='2015'>Go to a Different Year</option><option value='2015'>2015</option><option value='2014'>2014</option><option value='2013'>2013</option><option value='2012'>2012</option><option value='2011'>2011</option><option value='2010'>2010</option><option value='2009'>2009</option><option value='2008'>2008</option><option value='2007'>2007</option><option value='2006'>2006</option><option value='2005'>2005</option><option value='2004'>2004</option><option value='2003'>2003</option><option value='2002'>2002</option><option value='2001'>2001</option> </select> <input type='submit' value='Go' /> </form>
This is the drop-down list, which exists in a PHP form object. The form sends an API post request when the user hits the "Go" button. Since there is a name associated with this form, newYear, we can pass along that variable right in the URL to get back the years we want.
http://www.energy.ca.gov/almanac/renewables_data/solar/index.php?newYear=2014
In Alteryx, you could have a field with all of the URLs from the different years as rows, pass it through the Download tool, and get a field of DownloadData as your response.
Best part? You can just add a few lines to the Text Input tool in the workflow we built out today and get back all the data from these years, without having to change any RegEx.
Cheers,
Alex