I am attempting to create an In-DB workflow that will select columns, filter them, and then summarize them. I am running against a Hadoop database that uses HiveQL. However, when I run my workflow, I get an error:
Error: Filter In-DB (12): Error SQLPrepare: [Cloudera][Hardy] (80) Syntax or semantic analysis error thrown in server while executing query. Error message from server: Error while compiling statement: FAILED: SemanticException [Error 10004]: Line 1:22 Invalid table alias or column reference 'fruit_example.fruit': (possible column names are: fruit, store, price, quality)
Workflow with Error
For some reason, when I am performing In-DB work, Alteryx is adding the table name as a prefix to the column names. This causes the query to fail when it is sent back to the database.
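To illustrate (this is a sketch based on the names in the error message, not the exact SQL Alteryx generates), the Filter In-DB tool appears to wrap the incoming query in a subquery and then reference columns with the original table's prefix, which Hive rejects because the subquery only exposes the bare column names:

```sql
-- Fails: the outer query references fruit_example.fruit, but the subquery's
-- result set only exposes fruit, store, price, quality
SELECT *
FROM (SELECT fruit, store, price, quality FROM fruit_example) sub
WHERE fruit_example.fruit = 'apple';

-- Accepted by Hive: reference the bare column name (or the subquery alias)
SELECT *
FROM (SELECT fruit, store, price, quality FROM fruit_example) sub
WHERE fruit = 'apple';
```

The `fruit_example` table and the `'apple'` filter value here are hypothetical, chosen to match the column list in the error text.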
In the Select In-DB tool, the "Remove Prefix" option is unavailable (greyed out).
If I only perform a Select In-DB, the workflow runs, but as soon as I add a filter, it errors.

The query used in the Connect In-DB tool
This is what is sent from the Connect In-DB tool to the next tool in line.
I could do that; however, in the production version of this, there could be hundreds of possible columns that would need to be changed every time the workflow is run. And, eventually, I would like to turn this into an app that allows the user to choose the columns to be extracted.
Unfortunately, there is no Dynamic Rename In-DB tool in the version of Alteryx used at my company (2018.3.4.51585). I have not seen one in newer versions either, but if it exists, please let me know and I will advise our IT that we need to test it.
For some reason, this is only happening when I run In-DB with Hive tables. When I tested it with Teradata, there was no issue.
The live data we are working with can run into billions of rows, so it needs to be filtered and aggregated prior to being extracted.
Thanks for the reply. I was able to identify the cause of the issue and a resolution.
All of my In-DB work uses a Hive ODBC driver. The most recent Hive driver (I think v2.6) was pushed to my machine the same day I began encountering this issue. So I uninstalled the new driver, installed v2.1, and the issue was resolved.
I have found that in the ODBC Data Source Administrator, if you go to your Hive DSN configuration > Advanced Options and tick "Use Native Query", the driver won't use the table_name.column_name syntax anymore and will use just column_name instead.
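In other words (a hedged sketch, not driver documentation), without that option the driver rewrites the SQL before sending it, and the rewritten form qualifies every column with the table name; with "Use Native Query" ticked, the statement is passed through to Hive as written:

```sql
-- Without "Use Native Query": the driver's rewritten SQL qualifies columns,
-- which breaks once the query is wrapped in a subquery by an In-DB tool
SELECT fruit_example.fruit, fruit_example.store
FROM fruit_example;

-- With "Use Native Query": the statement is sent as-is, with bare column names
SELECT fruit, store
FROM fruit_example;
```

Note that "Use Native Query" disables the driver's SQL translation entirely, so any queries you write must already be valid HiveQL.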