on 06-22-2017 04:26 PM - edited 2 weeks ago
How To: Write to Hive Faster
Hive ODBC can be slow when loading data into tables. If you need a faster way to write to Hive and want to create a new table or overwrite an existing one, use the In-DB tools to output your data.
Windows Operating System
A working ODBC DSN for Hive
Write access to HDFS directly
This option requires an In-DB connection. To create the connection:
1. Open the Manage In-DB Connections window, either by going to Options > Advanced Options > Manage In-DB Connections or by dragging a Connect In-DB tool onto the canvas and selecting Manage Connections in the drop-down.
2. Select Hive as the Data Source.
3. Click "New" to create a new connection, or select an existing connection to edit.
4. For a new connection, enter a Connection Name.
5. On the Read tab, select the ODBC DSN to be used.
6. On the Write tab, select HDFS(Avro) as the Driver.
7. Click the drop-down and select New HDFS connection... to create the connection string, or select an existing HDFS connection if any are displayed.
- It is helpful to first test the connection in a regular Input tool to make sure it works.
- The information to enter in the window can be obtained from the DB Admin/IT team.
- The Temp Directory must be filled in so that Alteryx can write out a temporary Avro file. The default value is /temp. It can be changed to any directory the user has access to.
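If the cluster exposes the standard WebHDFS REST interface, the Temp Directory can also be checked outside Alteryx before saving the connection. The sketch below is only an illustration under that assumption: the host name, port, and user name are placeholders, the helper names are hypothetical, and the check confirms only that the path exists and is a directory, not that the user can write to it. Clusters that use Kerberos, HTTPS, or a Knox gateway will need a different request.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen


def webhdfs_status_url(host, port, path, user):
    """Build a WebHDFS GETFILESTATUS URL for an HDFS path (hypothetical helper)."""
    query = urlencode({"op": "GETFILESTATUS", "user.name": user})
    return f"http://{host}:{port}/webhdfs/v1{quote(path)}?{query}"


def temp_dir_exists(host, port, path, user, timeout=10):
    """Return True if the HDFS path exists and is a directory.

    urlopen raises urllib.error.HTTPError when the path is missing or
    access is denied, and urllib.error.URLError when the host is unreachable.
    """
    url = webhdfs_status_url(host, port, path, user)
    with urlopen(url, timeout=timeout) as resp:
        status = json.load(resp)["FileStatus"]
    return status["type"] == "DIRECTORY"
```

For example, `temp_dir_exists("namenode.example.com", 50070, "/temp", "alice")` would query the default /temp directory. The host, port, and user here are assumed values; the real ones come from the DB Admin/IT team.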