Community Spring Cleaning week is here! Join your fellow Maveryx in digging through your old posts and marking comments on them as solved. Learn more here!

Archived Training

Parsing with Regular Expressions

JoeM
Alteryx Alumni (Retired)

 

Sometimes using the text-to-columns or the formula is not enough. It's high time for a skills upgrade. With a working knowledge of regular expressions, you'll soon tame the most wildly cluttered data.

You’ll Learn How to:
Parse Data

Featured Tools:
RegEx

 

Presenter: @AlexKo

Date: 2017-03-27

Alteryx Designer Version: 11.0

4 REPLIES 4
MarqueeCrew
20 - Arcturus
20 - Arcturus

@JoeM,

 

The link for the .ics calendar appointment gets you a meeting on 3/7 instead of 3/28.

 

I hope to be there and see what you're cooking up for us.

Alteryx ACE & Top Community Contributor

Chaos reigns within. Repent, reflect and restart. Order shall return.
Please Subscribe to my youTube channel.
JoeM
Alteryx Alumni (Retired)

@MarqueeCrew

Thanks for noting that error. I updated yesterday morning so that the date should now be correct.

BPurcell2
9 - Comet

Great Session, really demystified RegEx.

 

Using the URL in the demo, http://www.energy.ca.gov/almanac/renewables_data/solar/, how could you create a workflow to pull all years in the year selection drop down (or perhaps 2010 - 2015).  Is that possible?

 

Thanks.

 

Capture.JPG

AlexKo
Alteryx Alumni (Retired)

Hi @BPurcell2,

 

Cool question. Yeah you can do that. This website uses PHP for its interactive elements. If you use that year selection menu, you might notice that the URL changes to...

http://www.energy.ca.gov/almanac/renewables_data/solar/index.php

 

But it doesn't change the URL based on the year, it just updates the tables on the webpage. So what's going on behind the scenes? You can click anywhere on the page and "View page source" to find out. Here's the part that's important.

 

<form style="margin-left:23px;" action="index.php" method="post">
	
		<select id="goYear" name="newYear">
		
			
<option value='2015'>Go to a Different Year</option><option value='2015'>2015</option><option value='2014'>2014</option><option value='2013'>2013</option><option value='2012'>2012</option><option value='2011'>2011</option><option value='2010'>2010</option><option value='2009'>2009</option><option value='2008'>2008</option><option value='2007'>2007</option><option value='2006'>2006</option><option value='2005'>2005</option><option value='2004'>2004</option><option value='2003'>2003</option><option value='2002'>2002</option><option value='2001'>2001</option>			
			</select>
			<input type='submit' value='Go' />
	
</form>

 

This is the drop-down list, which exists in a PHP form object. The form sends an API post request when the user hits the "Go" button. Since there is a name associated with this form, newYear, we can pass along that variable right in the URL to get back the years we want.

 

http://www.energy.ca.gov/almanac/renewables_data/solar/index.php?newYear=2014

 

In Alteryx, you could have a field with all of the URLs from the different years as rows, pass it through the Download tool, and get a field of DownloadData as your response.

 

Capture.JPG

Best part? You can just add a few lines to the Text Input tool in the workflow we built out today and get back all the data from these years, without having to change any RegEx.

 

Cheers,

Alex

Alex Koszycki
Program Manager, Community Platform