Given below is the data:
aaaa31/8/2025aaa
aaaa31/08/2025aaa
aaaDec/31/21aa12a
aaa1/1/21aa12a
aaadec-31-2021aaa
aaa31-Dec-20aaa
abc xyz pqr 18SEP52
abc xyz pqr BP0006T 22JUN26
aaa 18-SEP-52a222aaaa
Sep-18-52 Fx0006KT
General points to note:
As one can notice above, year will always be the in the right most position & will either be in 2 digits or 4 digits. Eg: 31-Dec-2021 or 31-Dec-21
Day or month both can be in the left most position: Eg: 31/Dec/2021 or Dec/31/2021
Month can either be in numeric (single or 2 digits) or 3 letter word. Eg: January can be as - jan or 01 or 1
Day can be in single or 2 digits
Delimiter can be / or - or nothing . Eg: 31/12/2021 or 31-12-21 or 12dec2021
I want a regex expression that is able to extract the dates that are available in different formats as shown above and is able to cover different possibilities as specified in general points above
Regards,
Chirag
Solved! Go to Solution.
Find in attachement the regex expression to do that.
Let me know if there is any issue and do not hesitate to accept this answer as solution if it helped.
Bonus: I also parsed the data in date format. 😉
@Emmanuel_G This solves a similar problem that I have with cleaning 50 different files all submitted with various date formats, some times in the same file. I listed all the formats in an excel file that helped build excel formulas. Now I can migrate the process into Alteryx!
@michaelsammons check this out!
Yes you can.
Since you have all the formats, it should be easy.
Don't hesitate to open a ticket on community if you want us to help you with extraction or parsing. 🙂
@Emmanuel_G
I tried to download this, but it says that "A Regular Expression was not specified" for the RegEx tool. How should it look?
Here is it
((?:\d+\/\d+\/\d+)|(?:[[:alpha:]]{3}\/\d+\/\d+)|(?:[[:alpha:]]{3}-\d+-\d+)|(?:\d+-[[:alpha:]]{3}-\d+)|(?:\d{2}\w+\d{2}))
Thank you for your help @Emmanuel_G !
User | Count |
---|---|
19 | |
15 | |
13 | |
9 | |
8 |