Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Fix bad year (YYYY)

YeahMan
8 - Asteroid

Hi, playing with the trial version of Alteryx Designer and threw some particularly dirty, but actual, data at it. My first challenge is that some of the user-entered dates have typos. I've split the date into 3 columns so that I can clean up the year then re-glue it back together again. For example I have the following years:

 

3019

3017

...

2107

2106

2071

2027

 

Most of these are pretty clearly typos especially since these dates are historical (test scores). So my question - do I really have to write a fairly convoluted IF to clean these? Is there a CASE/SWITCH statement I can use to at least group them? Is there a better way?

Thanks. 

3 REPLIES 3
BrandonB
Alteryx
Alteryx

Using a filter tool you can say [date] <= datetimenow() so all of the historical records will come out of the True and future records will come out of the false. Typos can be tricky because there need to be ways that you can establish what was actually intended. For instance, if the first digit of the year starts with a 3, you may be able to infer that they always meant to put a 2. Similarly, if the number is 2071 they may have actually meant 2017. Where it gets tricky is if someone typed 2018 and actually meant 2017, it is impossible to know that was the case. 

 

So maybe as a start, you can split to the historical and future streams using a filter tool, then use a regex tool to split the years into three new columns using something like (\d{1})(\d{1})(\d{2}). Then you can use a formula tool to set the 3 in the first field to a 2 using a formula tool and then figure out rules for how you want to deal with other potential typos. 

years.png

DiganP
Alteryx Alumni (Retired)

@YeahMan Is only the first digit that's a type (3 -> 2) or sometimes all the 4 digits? How do you know which digits are wrong? 

 

You can use the formula tool or the regex tool to parse out the digits into 4 fields then add logic to fix the YYYY. 

 

Do you have a before > after for the dataset you shared?

Digan
Alteryx
YeahMan
8 - Asteroid

I ended up using a variation of your solution with a RegEx_Replace function. Thank you for your help.

Labels