I have found some odd behavior in how dates are encoded in Avro output, and I was hoping someone might be able to shed some light on it.
I am connecting to SQL Server (say, the AdventureWorks database) and pulling a table, say DimEmployee, that has date columns (date, not datetime). Alteryx reads the data just fine and the columns show up as dates. I can then write the data to an Avro file, either locally or on Hadoop, without any trouble. My issue is that during encoding Alteryx converts each date to a string. You can see this by looking at the schema in the header of the Avro file. For example:
...
{
  "name": "HireDate",
  "type": [
    "null",
    "string"
  ]
},
...
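For anyone who wants to check this on their own file, here is a minimal sketch of pulling the schema out of an Avro container header using only the Python standard library. It assumes the standard object container layout (the magic bytes `Obj\x01` followed by a metadata map containing an `avro.schema` key); the function names are my own and not part of any library.

```python
import io
import json


def read_long(stream):
    """Decode one Avro long (zigzag-encoded, variable-length integer)."""
    shift = 0
    accum = 0
    while True:
        byte = stream.read(1)
        if not byte:
            raise EOFError("unexpected end of Avro header")
        value = byte[0]
        accum |= (value & 0x7F) << shift
        if not value & 0x80:  # high bit clear marks the last byte
            break
        shift += 7
    return (accum >> 1) ^ -(accum & 1)  # undo zigzag encoding


def read_bytes(stream):
    """Read a length-prefixed Avro bytes/string value."""
    length = read_long(stream)
    return stream.read(length)


def read_avro_schema(stream):
    """Return the writer schema stored in an Avro container file header."""
    if stream.read(4) != b"Obj\x01":
        raise ValueError("not an Avro object container file")
    meta = {}
    while True:
        count = read_long(stream)
        if count == 0:  # a zero count ends the metadata map
            break
        if count < 0:  # negative count is followed by a block byte size
            read_long(stream)
            count = -count
        for _ in range(count):
            key = read_bytes(stream).decode("utf-8")
            meta[key] = read_bytes(stream)
    return json.loads(meta["avro.schema"])
```

Usage would look something like `read_avro_schema(open("DimEmployee.avro", "rb"))`, where the file name is just a placeholder for whatever the output tool wrote.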
Avro does support dates, but they have to be encoded differently. The Avro documentation shows how here:
https://avro.apache.org/docs/current/spec.html#Date
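To illustrate what I would expect instead: per the spec, a date is an `int` annotated with `"logicalType": "date"`, holding the number of days since 1970-01-01. The sketch below shows the field declaration I had in mind and the conversion from a string to that integer. It assumes the strings are written as `YYYY-MM-DD`, and `to_avro_date` is a hypothetical helper name, not anything from Alteryx or the Avro libraries.

```python
import json
from datetime import date

# Field declaration using the Avro `date` logical type, keeping the
# nullable union from the HireDate field.
HIRE_DATE_FIELD = {
    "name": "HireDate",
    "type": ["null", {"type": "int", "logicalType": "date"}],
}

EPOCH = date(1970, 1, 1)


def to_avro_date(iso_string):
    """Convert 'YYYY-MM-DD' to days since the Unix epoch; None stays None.

    Assumes the input is an ISO-formatted date string.
    """
    if iso_string is None:
        return None
    return (date.fromisoformat(iso_string) - EPOCH).days


print(json.dumps(HIRE_DATE_FIELD, indent=2))
print(to_avro_date("1970-01-02"))  # 1
```

This is roughly the extra conversion step that has to happen downstream when the column arrives as a string.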
Why does Alteryx force dates to become strings? I can work around it once the file is on HDFS, but it's kind of a pain to take the extra step.