
Alteryx Designer Desktop Knowledge Base

Definitive answers from Designer Desktop experts.

How to set up Google Big Query Bulk Write in Designer 22.3

jpowellsd
Alteryx
Created

How do I set up the Google Big Query Bulk Write connection?


Starting in Designer 22.3, there is new functionality that allows bulk writes to Google Big Query. This article shows how to set this up using DCM and In-DB tools.
 

Prerequisites

 
  • Alteryx Designer
    • Version 2022.3
  • Simba Big Query ODBC Driver
    • GBQ_Simba_ODBC_2.4.6.1015
  • Google Big Query credentials
    • User authentication: Google username/password
    • Confirmation Code (you will receive this after login)
    • Catalog/Project
    • Dataset
    • Bucket Name (for BulkWrite)
 
 

Procedure

 
1. Download and install the Simba ODBC driver for Google Big Query by logging into licenses.alteryx.com and going to Product Downloads->Drivers.



2. Scroll down to Google Big Query in the list of drivers. Select the name, then download the corresponding MSI file on the next screen; the current version is GBQ_Simba_ODBC_2.4.6.1015. Click the filename to start the download. Double-click the file to install the ODBC driver, accepting the defaults or changing the installation location if desired. Click Install when finished.




3. Create a System DSN for Google Big Query in the ODBC Data Source Administrator (64-bit). You will need to select User Auth for the OAuth mechanism and Sign In with your Google account. Select Allow when presented with the "BigQuery Client Tools wants to access your Google Account" window. You will then be shown a window with your Authorization code. Copy this code (there's a button to the right of the code) and paste it into the Confirmation Code field of the DSN Setup page. If you then click in the Refresh Token input, the Refresh Token will be filled in automatically. Be sure to select the Catalog/Project and Dataset at the bottom of the form and click Test to make sure it all works before selecting OK to finish.
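For reference, the finished System DSN stores settings along these lines. This is an illustrative sketch only: key names can vary between Simba driver versions, and all values shown are placeholders, not real settings.

```ini
; Illustrative sketch of a Simba GBQ ODBC DSN -- key names may differ by driver version
[Google BigQuery]
Driver=Simba ODBC Driver for Google BigQuery
OAuthMechanism=1                  ; 1 = User Authentication (per step 3)
RefreshToken=<filled in automatically from the Confirmation Code>
Catalog=my-gcp-project            ; placeholder Catalog/Project
DefaultDataset=my_dataset         ; placeholder Dataset
```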



4. Next, you’ll need to create your DCM connections. You’ll create one for Google Big Query inputs using the ODBC DSN you created in step 3. You’ll also create a separate one for the bulk write Big Query outputs. You can create these either by selecting File->Manage Connections or by clicking on the dropdown menu next to the connection string:

5. For this example, I’m creating the DCMs from within a Connect In-DB tool. I'll select the dropdown menu next to the Connection Name and choose Manage Connections.

                  


6. When the Connection Manager comes up, we'll create the Read/input DCM for Google Big Query. Select Data Source -> Google BigQuery, Connection Type = User, select New for the Connection, enter a Connection Name, and then select Setup Connection.




7. Now when the Connection Manager comes up, select Add Data Source, then select Technology -> Google BigQuery ODBC with Simba.




Enter a Data Source Name. Click in the ODBC DSN field to select the System DSN you created in step 3 from the list, then select Save.




8. Next, we’ll add a data credential. Select Connect Credential, and then choose Sign in with Google.




9. Wait a few seconds until the OAuth Tokens section appears. Select Create New Credential from the pulldown and type in a Credential Name. Then select Create and Link.




10. Now select Connect.



11. This should bring you back to the Manage In-DB Connections page, and you should now see the Driver and Connection filled in:



12. The last thing we’ll set up is the output/Write DCM connection. Click on the Write tab, verify that the Google Big Query Bulk Loader driver is selected, and then click Setup Connection.




13. You should see any created Google BQ Bulk Loader Data Sources (in this case it’ll be empty since we’re creating our first one). Select Add Data Source.




14. Fill in the Data Source Name, Catalog/Project Name, Dataset, and Bucket Name for your data source, and select Save. This information should be available from your Google administrator (and was used to create the System DSN in step 3). You may need to ask them to create a bucket for you if one isn't already available; the bucket is required for staging the data when writing to Big Query and cannot be left blank.
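Because a blank bucket name is the most common cause of bulk-write failures, it can help to sanity-check these four values before creating the data source. The sketch below is purely illustrative; the field names are labels I've chosen for this example, not an Alteryx API.

```python
def missing_bulk_write_fields(cfg: dict) -> list:
    """Return the required bulk-write data source fields that are blank.

    Field names are illustrative labels, not an Alteryx API.
    """
    required = ["data_source_name", "catalog_project", "dataset", "bucket"]
    return [field for field in required if not str(cfg.get(field, "")).strip()]

# Example: a configuration that forgot the staging bucket
cfg = {
    "data_source_name": "gbq_bulk",       # placeholder values throughout
    "catalog_project": "my-gcp-project",
    "dataset": "my_dataset",
    "bucket": "",                          # blank -> bulk write will fail
}
print(missing_bulk_write_fields(cfg))      # -> ['bucket']
```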




15. Select Connect Credential, select the Authentication Method (Sign in with Google) and OAuth Tokens Credential (Create New Credential), enter a Credential Name, and then select Create and Link.





16. Then select Connect again.

17. You should land back on the Manage In-DB Connections page; the Connection section should be filled in with the Bulk Load credentials. Select Apply and then OK.




18. You should now be presented with a Google login form. Select or fill in your account information and select Continue to grant Google Big Query the requested permissions.

19. When it returns,  you should see the following if you've successfully logged in: 






20. If you're using a Connect In-DB tool, it should display the Visual Query Builder at this point and allow you to select a table and build your query. When done, go to the SQL Editor tab and select Test Query. When successful, select OK. You can now run the workflow and should see data coming back from your Big Query table(s).

21. This last example uses two input data sources: a CSV file and a SQL Server table. Each is written to a separate temp table in Google BQ (Data Stream In tool), then unioned together and written to a new BQ table. The DCMs will be the same for both Data Stream In tools. The Write Data Out tool uses the connection info from the Data Stream In tool (set up similarly to steps 17-20 above), but also takes a table name and output mode (e.g., Overwrite Table (Drop) or Create New Table) to determine where to write the data.



Common Issues


Typically, the errors you see will be the result of invalid credentials or of leaving the bucket name blank in the Bulk Write data source in step 14. Be sure to have all of your credentials available when creating the System DSN and the Bulk Write DCM.
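As a rough triage aid, the common error texts map onto likely causes. The sketch below is a heuristic based only on the issues described in this article (including the "Missing object." error reported in the comments), not an official error catalog.

```python
def diagnose_gbq_bulk_write_error(message: str) -> str:
    """Heuristic: map common bulk-write error text to a likely cause.

    Based on the failure modes discussed in this article, not an official list.
    """
    msg = message.lower()
    if "missing object" in msg:
        # Commonly traced to the staging bucket in the bulk-write data source (step 14)
        return "check the staging bucket name in the bulk-write data source"
    if "token" in msg or "credential" in msg or "unauthorized" in msg:
        return "re-link the OAuth credential and sign in with Google again"
    return "verify the System DSN and DCM settings, then re-test the connection"

print(diagnose_gbq_bulk_write_error(
    "Failed to complete write to Google BigQuery: Missing object."))
# -> check the staging bucket name in the bulk-write data source
```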
 
 


Additional Resources

Comments
alyssa_williams
5 - Atom

Great info! If a user is wanting to publish this workflow to Gallery, can a File connection type be used, instead of user? So that the file connection can be embedded in the workflow? We've had success running workflows locally with the approach you've outlined, but we're unable to get the workflows to run successfully on Alteryx Gallery.

Danial_Nawab
7 - Meteor

Can we not establish a Google Big Query connection on Alteryx Server under the "Data Connections" section?

alyssa_williams
5 - Atom

As far as I've been able to tell, In-DB Google Big Query is only supported using DCM - you can't use the "Data Connections" Alteryx Gallery option.

JMB001
8 - Asteroid

Has anyone else received the "Failed to complete write to Google BigQuery: Missing object." error when running a workflow on the Gallery? It seems to be an error coming from Google Cloud Storage itself, but it isn't clearly diagnosable.