Want to get involved? We're always looking for ideas and content for Weekly Challenges.
SUBMIT YOUR IDEAFor the fourth challenge let’s look at parsing Dates from text strings. To view the previous challenge, click HERE.
A dataset contains a text field that has a date embedded within the text. The problem is that the date is represented a few different ways. For example:
The goal is to create a new Date/Time field populated with the dates contained within the text field. You will also need to standardize the dates so that they are all formatted the same.
We have listed this as an advanced exercise since parsing out the dates can be challenging depending on the technique you employ to do it. As always, we love to hear your comments. Have fun!
UPDATE 12/7/2015:
The solution has been uploaded
We updated this exercise with the solution. Be sure to check out this week's exercise.
Thanks Tara. Would it be possible for the analyst the develops the recommended solution to post a brief write up explaining their choices in solving the problem? There's always multiple ways to approach such a problem and I think it's extremely helpful for learning to understand they 'why' behind a solution.
For example, in the first RegEx tool node, 4 output fields are configured, which then get evaluated into a single output via a sequence in a single formula tool. As I examine the RegEx tool, I wonder why 4 outputs were needed? Does each solve for a particular format? Are they designed to be mutually exclusive, etc.
I think of there was just a brief 1-2 sentence explanation for the particular choices, it would greatly enhance understanding of the solution.
Hi @dataMack,
Like you said, there is always multiple ways to approach work flows in Alteryx. In the case above there was 4 different formats the dates could be in. Since I'm not an expert with RegEx wanted to keep each in its own column for reformatting downstream (thus the 4). If I was a RegEx master I'm sure there is a single statement to identify all the dates and extract them in a nice clean date-time format.
I will try to provide a bit more explanation on approaches in the future, but keep in mind that there are many ways to solve an exercise and we are only posting an example. There is always another way and more often than not, a better way.
Best Regards,
@GeneR
This is the first exercise I've downloaded and looked at so far (we just got our Alteryx licenses). I initially looked at the RegEx syntax and thought "my Lord, how am I ever going to unpack that so that I could ever hope to build one of those myself", and then I noticed the drop-down menu next to the Regular Expression text box. That is seriously wonderful...the fact that you can build a RegEx statement by just selecting the components you want in the order you want. I realize this won't be necessary once I familiarize myself with the syntax more, but having this tool for the novice user is gold. Alteryx is an amazing tool. That's all I wanted to say. :-)
-Mike
My perpective was there were variations for only two formats.
(\d{1,2}(\s|-)\D{3,}(\s|-)\d{2,4})
(\D{3}(\s)\d{1,2}.{1,2}\d{2,4})
My regex skills are lacking, but here goes: