Hi!
I am kind of stuck at the end of the tunnel here for a POC meant to streamline AWS S3 data loads.
The goal is to,
1) Parse and load files to AWS S3 into different buckets which will be queried through Athena
2) Create external tables in Athena from the workflow for the files
3) Load partitions by running a script dynamically to load partitions in the newly created Athena tables
So far, I was able to parse and load file to S3 and generate scripts that can be run on Athena to create tables and load partitions. The major issue now is that the Dynamic Input module which allows me to run Athena queries through a Simba Athena ODBC driver will not allow me to run any DDL operations.
Following is the error received...
Dynamic Input (3) Error SQLPrepare: [Simba][Athena] (1040) An error has been thrown from the AWS Athena client. Athena Error No: 130, HTTP Response Code: 400, Exception Name: InvalidRequestException, Error Message: line 1:30: extraneous input 'CREATE' expecting {'(', 'ADD', 'ALL', 'SOME', 'ANY', 'AT', 'NO', 'SUBSTRING', 'POSITION', 'TINYINT', 'SMALLINT', 'INTEGER', 'DATE', 'TIME', 'TIMESTAMP', 'INTERVAL', 'YEAR', 'MONTH', 'DAY', 'HOUR', 'MINUTE', 'SECOND', 'ZONE', 'FILTER', 'OVER', 'PARTITION', 'RANGE', 'ROWS', 'PRECEDING', 'FOLLOWING', 'CURRENT', 'ROW', 'SCHEMA', 'COMMENT', 'VIEW', 'REPLACE', 'GRANT', 'REVOKE', 'PRIVILEGES', 'PUBLIC', 'OPTION', 'EXPLAIN', 'ANALYZE', 'FORMAT', 'TYPE', 'TEXT', 'GRAPHVIZ', 'LOGICAL', 'DISTRIBUTED', 'VALIDATE', 'SHOW', 'TABLES', 'VIEWS', 'SCHEMAS', 'CATALOGS', 'COLUMNS', 'COLUMN', 'USE', 'PARTITIONS', 'FUNCTIONS', 'TO', 'SYSTEM', 'BERNOULLI', 'POISSONIZED', 'TABLESAMPLE', 'UNNEST', 'ARRAY', 'MAP', 'SET', 'RESET', 'SESSION', 'DATA', 'START', 'TRANSACTION', 'COMMIT', 'ROLLBACK', 'WORK', 'ISOLATION', 'LEVEL', 'SERI
Here is the query syntax I have that works fine in Athena but not through the Dynamic Input in Alteryx.
CREATE EXTERNAL TABLE IF NOT EXISTS athenadbname.athenatblname (col_one string,col_two string,col_three string) PARTITIONED BY (date string) ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe' STORED AS INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' LOCATION 's3://bucket/athenatblname' TBLPROPERTIES ('parquet.compress'='gzip')
Any help is much appreciated!
Thank you.