I am sorry for the rather long post – but I had a bit of a struggle getting my thoughts across on pen and paper! Please grab a coffee as you read this!
I need some help in understanding how Alteryx “read” and “assigns” date data from Excel files. We work a lot with dates and getting them read correctly into Alteryx is key to our analysis. Throughout this post by “read’ I mean placing an Input Data Tool onto the canvas, and selecting the Excel file.
I have noticed a couple of things when I “read” Excel files containing dates into Alteryx.
For example, an Excel file has dates formatted and inputted as dd/mm/yyyy.
On reading this Excel file into Alteryx the dates in the Input Data Tool’s preview window automatically appear in ISO format yyyy-mm-dd (original Excel date format is lost) whether or not the box in the configuration of Input Data tool - "Box 6: First Row Contains Data" has been ticked.
Additionally, in the Select Tool sometimes even though Alteryx previews the dates in the ISO format the field itself has not been automatically assigned as a date data type (it may still be V_String data type when seen on Select tool).
Please see attached pdf (Snapshot_date_inputs_data_assignments.pdf) showing snap shots of above points within a workflow.
My first question is why does Alteryx automatically preview a date field from Excel in ISO format? There are times when I would like the original Excel date format preserved when reading the file into Alteryx and analyzing it in a workflow. Also at times there may be a variety of date formats within a date field and I would like all of them preserved for further analysis in Alteryx. But if Alteryx automatically switches them to ISO format as soon as they are inputted, I lose the ability to conduct further analysis.
(Side note – sometimes I have noticed that a date field has not been read into ISO format but is still in the original Excel date format. So, there is no consistency on how Alteryx reads date fields from Excel. Or perhaps there is and I may not be understanding various external or internal factors at play?)
I had asked this automatic date conversion question in an earlier post (Stop-automatic-DateTime-conversion-upon-input) but I would now like to go deeper and understand how can I preserve the original Excel date format(s) (note there may be a variety of date formats in a given date field/column of my Excel file) when the Excel file is passed to the Input Data tool. I would like to be able to prevent Alteryx from automatically reading the dates into the ISO format.
A second question is how or why does Alteryx determine when to automatically assign a date field as date data type. As can be seen in attached pdf, there were 2 date fields - one was correctly automatically assigned as a date data type while the other was assigned as V_String data type. What is the trigger for this?
I hope my ramblings have been clear as mud! Let me know if anyone out there has had same issues as me and what are the fixes to get Alteryx to be consistent when reading dates from Excel.
Liar paradox. In philosophy and logic, the classical liar paradox or liar's paradox is the statement of a liar who states that they are lying: for instance, declaring that "I am lying" or "everything I say is false". If they are indeed lying, they are telling the truth, which means they are lying.
When you view an excel file, you are looking at the data through a formatting lens. The underlying data is only stored in the WYSIWYG format if the data is preceded by a tick mark ('). As a sample I created a DOB field with 199 rows of dates followed by "Unknown". The dates were formatted as: MM/DD/YYYY and when read into Alteryx, the dates transformed to the ISO standrd of YYYY-MM-DD. Pretty magical indeed. The final row still is present as "Unknown". The data type for my DOB field is vstring.
Because Excel has control of the cell values it is going to be read via the driver as a date and Alteryx will ISO format that date. Now if I had saved the Excel file as a CSV and read the data, then the original formatting is preserved and I see string values presented via Alteryx (mm/dd/yyyy).
I'm sure that I didn't get through all of your questions, but I didn't have any coffee.
ACE & Alteryx Certified Partner Chaos reigns within. Repent, reflect and reboot. Order shall return.
The Liar's Paradox - that gave me quite a bit of food for thought on a Fri morn!
Thanks for the helpful insight on "how things are done" in Excel and the little side trick that you offer i.e. saving the Excel file as a CSV.
In most cases I can afford to save Excel files as a CSV. And when they are read into Alteryx - bang! - I get all the dates in the original Excel format (although all fields get read as V_Strings and it's a bit of work assigning them correct data types prior to my analysis - Auto Field does not always work..).
I think for my current task at hand (preserving Excel date formats) the CSV conversion of Excel files prior to getting them into Alteryx is a handy and neat trick. Though, not sure if this is all round solution i.e. will it always work?