I have a dataset that has a bunch of date fields in the following format "DD-MMM-YYYY". It seems like I have to go thru a convoluted process to convert from a string to date. I dont understand why I can't change the type simply using the select configuration. Is anyone else annoyed with how alteryx handles dates? Seems like it should be easier.
Thanks,
For those who need to see the answer to the question first, you can use a formula like this or change [Field1] to [_CurrentField_] and use a Multi-Field Formula tool:
DateTimeParse([Field1],"%d-%b-%y")
How did I know the answer to this question? Was it because I am an ACE? Do I have a photographic memory? None of these are right. I knew the answer because I knew where to lookup the date formats (DateTime Format Help). Based upon this question and the presence of frustration, I have opted to write this blog and (as a present for those who read it) I have included an incubator macro (something that I'm playing with) that isn't generally available. My solution post (as of now) hasn't been starred or accepted as a solution, so I don't know if it helped Mike, but let's assume that it has. The community has literally 1,000 "Date" posts in the last year. While I could make a bundle of SOLVES if this continues, it does seem like there are an awful lot of questions regarding dates (note: not all are transformation questions).
Alternative approaches to working with dates include the BB Date tool and the DateTime tool. I've even created an idea (Alteryx making dating easier) that you can STAR to change the way that the product handles dates. Until dating becomes easier, here is a new approach to handling dates.
Date/Time format strings are used by the DateTimeParse and DateTimeFormat functions to tell Alteryx how to parse (read) or format (write) date/time values. Format strings are comprised of specifiers and separators.
Specifier | Output from DateTimeFormat | Supported Input with DateTimeParse |
---|---|---|
%a | Abbreviated weekday name ("Mon") | Any valid abbreviation of a day of the week ("mon", "Tues.", "Thur"), giving an error only if the text given is not a day of the week. Note that Alteryx does not check that the specified day name is valid for a particular date. |
%A | Full weekday name ("Monday") | Day name or any valid abbreviation of a day of the week ("mon", "Tues.", "Thur"), giving an error only if the text given is not a day of the week. Note that Alteryx does not check that the specified day name is valid for a particular date. |
%b | Abbreviated month name ("Sep") | Any valid abbreviation of a month name ("Sep", "SEPT."), giving an error only if the text given is not a name of a month. |
%B | Full month name ("September") | Month name or any valid abbreviation of a month name ("Sep", "SEPT."), giving an error only if the text given is not a name of a month. |
%d | Day of the month ("01") | One or two digits, ignoring spaces ("1" or "01") |
%e | Day of the month, leading 0 replaced by a space (" 1") | One or two digits, ignoring spaces ("1" or "01") |
%h | Same as %b ("Sep") | Any valid abbreviation of a month name ("Sep", "SEPT."), giving an error only if the text given is not a name of a month. |
%H | Hour in 24 hour clock, 00 to 23 | Up to two digits for hour, 0 to 23. Not compatible with %p or %P. |
%I (capital "eye") |
Hour in 12 hour clock, 01 to 12 | Up to two digits for hour, 1 to 12. Must follow with %p or %P. |
%j | The day of the year, from 001 to 365 (or 366 in leap years) | 3-digit day of the year, from 001 to 365 (or 366 in leap years) |
%k | 24 hours, leading zero is space, " 0" to "23" | Up to two digits for hour |
%l (lowercase "ell") |
12 hours, leading zero is space, " 1" to "12" | Not supported |
%M | Minutes, 00 to 59 | Up to two digits for minutes |
%m | Month number, 01 to 12 | One or two digit month number, 1 or 01 to 12 |
%p | "AM" or "PM" | Case blind ("aM" or "Pm"). Must follow %I (capital "eye", hour in 12-hour format) |
%P | "am" or "pm" | Case blind ("aM" or "Pm"). Must follow %I (capital "eye", hour in 12-hour format) |
%S | Seconds, 00 to 59 | Up to two digits for seconds |
%X | The 12-hour clock time, including AM or PM (“11:51:02 AM”) | Hours : Minutes : Seconds [AM / PM] |
%y | Last two digits of the year ("16") | Up to four digits are read, stopping at a separator or the end of the string, and mapped to a range of the current year minus 66 to current year plus 33. (For example, in 2016, that's 1950 to 2049.) |
%Y | All four digits of the year ("2016") | Two or four digits are read. Two digits are mapped to a range of the current year minus 66 to current year plus 33. (For example, in 2016, that's 1950 to 2049.) |
Please note that I didn't include the full help article here. I was even going to show fewer rows (e.g. remove time rows) but wanted everyone to get a good taste for the possibilities. The help article shows both how to convert strings to dates with DateTimeParse() as well as how to format dates back into strings. For simplicity, I am looking to help with the conversion to date fields here. So let's focus on the parse function.
The syntax is DateTimeParse([Fieldname],<Format>). This means that the format can either be a variable in the incoming data or it can be a string. If it is a string, then the formatted function would look like this for a sample start date field with a value of 23-Jul-1990:
DateTimeParse([Start Date], "%d-%b-%Y")
%d was chosen as the day of the month then '-' was the delimiter used followed by %b as the abbreviated month name followed by '-' followed by %Y as the 4-digit year. As long as all three components of the date are present in the data, you should be able to construct a formula that meets your specific needs.
For fun I created a macro that would sample the first 100,000 records from an input file and would find the most prevalent description of the incoming date formats (for non-null values). It can discern between 04/01/2018 as April 1st and January 4th (localization) if dates exist that make this clear (e.g. 04/13/2018) in that date field. If multiple formats exist for incoming dates, the Pct (percentage) column falls below 100%. Instead of showing each format, my thought was that you'll need to do some data investigation to better understand the source of the data. This "tool" macro can help you to understand the Parse function as a beginner and prepare you to better help others in the future.
What formats does it help with? Many formats are supported (the delimiter is any non-word character):
I didn't try to make all formats and I honestly didn't test all of these formats either. What I did do was create a "roughly right" macro and would ask you to test it out for me. See if it helps you and suggest ways to improve it. Take a look at the macro (open it) and see how I got the results. Here is a little logic for you:
REGEX_Match([Field1],"\l{3}\W+([1-9]|0[1-9]|1\d|2\d|3[01])\W+(19\d{2}|2\d{3})")
If this statement is true, then the format is: "%b-%d-%Y" where the - is whichever delimiter is really in your data.
I calculate a series of formats (e.g. b-d-YY) as boolean variables (True or False) and then convert the True values to expressions. I do a little magic and then pop out my guess for the format. Granted, I could actually transform the data and could allow you to do this for more than one field at a time. But alas, I'm not so inclined. I do want to help the Alteryx community. I welcome your feedback here and hope that if you are reading this post that it not only helps you to solve your current challenge but also gives you a forum to voice your opinions. Please feel free to download and to try the tool out. If there are problems here or if you think of ways to improve this tool, then I might publish a version as part of the CReW macros.
Cheers,
Mark
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.