I have an output file that contains more than 4M rows of data. This file is incorporated into Hadoop.
The problem I'm having is that most of the fields are formatted as String, which is automatically assigned a default field length of 32k.
Since this file is utilized as the source data for an external table, I can't alter the field types within Hadoop. The table is also used by Tableau and other SAS systems.
Is there a field type within Alteryx I should utilize so that Hadoop would recognize them as VarChar instead of String?
Are you using Alteryx to write to HDFS in a .csv format, and this file is used for the location of an External Hive Table?
Can you try to save the .csv to a new location and create a new external table, defining the string columns as varchar in your CREATE TABLE Hive syntax?
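As a sketch of what that CREATE TABLE could look like (the table name, columns, and path here are hypothetical placeholders, not taken from your file):

```sql
-- Hypothetical example: substitute your own column names, lengths, and HDFS path.
CREATE EXTERNAL TABLE my_table (
  id INT,
  customer_name VARCHAR(100),  -- declared VARCHAR instead of STRING
  region VARCHAR(50)
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/path/to/csv/directory'
TBLPROPERTIES ("skip.header.line.count"="1");
```

Since a CSV carries no type information, the column types Hive reports come entirely from this DDL, not from anything Alteryx writes into the file.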
Andrew, yes, that's what I'm doing. I create the table and set the field type to VarChar, but it remains String.
In your Output tool in Alteryx, does unselecting 'Write BOM' in the options make any difference?
Can you share the following?
I just did a quick example and was able to get VarChars to work.
Here is my data from Alteryx, which I write to a CSV on HDFS. The file location is /DemoData_RW/ajkramer/campaign/test.csv. It is the only file in the campaign directory.
I then create an External Hive Table:

CREATE EXTERNAL TABLE results_hive (
  age int,
  duration int,
  balance int,
  marital varchar(10),
  education varchar(10),
  y varchar(10),
  X_no double,
  X_yes double
)
row format delimited fields terminated by ','
LOCATION '/DemoData_RW/ajkramer/campaign'
tblproperties ("skip.header.line.count"="1");
My variables defined as varchar are still varchar.
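One quick way to check this on your end (using the results_hive table name from my example; substitute your own) is to ask Hive to describe the table, which prints each column with the type recorded in the metastore:

```sql
-- Shows each column name alongside its declared type,
-- e.g. marital should report varchar(10) rather than string.
DESCRIBE results_hive;
```

If DESCRIBE still shows string for columns you declared as varchar, the table you're querying may be an older definition, so dropping and recreating the external table (which doesn't touch the underlying CSV) is worth trying.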
Let me know what you are seeing on your end.