How do I set up the Google Big Query Bulk Write connection
Starting in Designer 22.3, there is a new functionality to allow bulk writes to Google Big Query. This article shows how to set these up using DCM and In-DB tools.
- Alteryx Designer
- Simba Big Query ODBC Driver
- Google Big Query credentials
- user authentication; Google username/password
- Confirmation Code ( will receive after login)
- Bucket Name (for BulkWrite)
1. Download and install the Simba ODBC driver for Google Big Query by logging into licenses.alteryx.com and go to Product Downloads->Drivers.image.png
2. Scroll down to Google Big Query in the list of drivers. Select the name, and then download the corresponding MSI file on the next screen; the current version is GBQ_Simba_ODBC_220.127.116.115
. Click on the filename to start the download. Double-click on the file to install the ODBC driver; accepting defaults or changing the installation location if desired. Click Install when finished. image.png
3. Create a System DSN for Google Big Query in the ODBC manager (64-bit). You will need to select User Auth for the OAuth mechanism and Sign In
with your Google account. You will have to select Allow
when presented with the "BigQuery Client Tools wants to access your Google Account" window. The last thing you will be presented with is a window with your Authorization code. You'll need to copy this code (there's a button to the right of the code) and paste it into the Confirmation Code section of the DSN Setup page. If you then click in the Refresh Token input, it will automatically fill in the Refresh Token.
Be sure and select the Catalog/Project and Dataset at the bottom of the form and Test
to make sure it all works, before selecting Ok to finish. image.png
4. Next, you’ll need to create your DCM connections. You’ll create one for Google Big Query inputs using the ODBC DSN you created in step 3. You’ll also create a separate one for the bulk write BIg Query outputs. You can create these either by selecting File->Manage Connections or by clicking on the dropdown menu next to the connection string:
5. For this example, I’m creating the DCMs from within a Connect-In-DB tool. I'll select the dropdown menu next to the Connection Name, and choose Manage Connections. image.png6. When the Connection Manager comes up, we'll create the Read/input DCM for Google Big Query. Select Data Source-> Google BigQuery, Connection Type = User, select New for the Connections, enter a Connection Name, and then select Setup Connection.image.png
7. Now when the Connection manager comes up, select Add Data Source. Then select Technology ->Google BigQuery ODBC with Simba.image.png image.png Enter a Data Source Name. Click in the ODBC DSN field to select the System DSN you created in step 2 from the list, then select Save.image.png
8. Next, we’ll add a data credential. Select Connect Credential, and then choose Sign in with Google image.png
9. Wait a few seconds until the Oauth Tokens sections appear. Select Create New Credential from the pulldown and type in a Credential name. Then select Create and Link. image.png
10. Now select Connect.image.png
11. This should bring you back to the Manage In-DB Connections page, and you should now see the Driver and Connection filled in:image.png
12. The last thing we’ll set up is the output/Write DCM connection; click on the Write tab, verify that the Google Big Query Bulk Loader driver is selected, and then click Setup Connection.image.png
13. You should see any created Google BQ Bulk Loader Data Sources (in this case it’ll be empty since we’re creating our first one). Select Add Data Source.image.png
14. Fill in the Data Source Name, Catalog/Project Name, Dataset, and Bucket Name for your data source, and select Save. This information should be available from your Google administrator (and was used to create the System DSN in step 3). You may have to have them create a bucket for you if there isn’t one already available. It is required for staging the data when writing to Big Query and cannot be left blank.image.png
15. Select Connect Credential and then select the Authentication Method (Sign-in with Google) and Oauth Tokens Credential (Create New Credential), enter a Credential Name, and then select Create and Link.image.png
16. And then select Connect again.
17. You should land back on the Manage In-DB Connections page; the Connection section should be filled in with the Bulk Load credentials. You can select Apply and then Ok.image.png
18. You should now be presented with a Google login form. Select or fill in your account information and select Continue to acknowledge that you agree to give permission to Google Big Query.
19. When it returns, you should see the following if you've successfully logged in: image.png
20. If you're using a Connect-In DB tool it should display the Visual Query Builder at this point and allow you to select a table and build your query. When done, go to the SQL Editor tab and select Test Query
. When successful, select OK. You can now run the workflow and should see data coming back from your Big Query table(s).
21. This last example uses 2 input data sources; a csv file, and a SQL Server table. Each is written into separate temp tables in Google BQ (Data Stream In Tool) then unioned together and written to a new BQ table. The DCMs will be the same for both Data Stream-In tools. The Write Data Out tool uses the connection info from the Data Stream in tool (set up similar to steps 17-20 above), but also takes a table name and output mode (e.g. Overwrite Table (drop) or Create New Table) to determine where to write the data. image.png
Typically, the types of errors you see will be a result of invalid credentials or leaving the bucket name blank in the Bulk Write credentials in step 14. Be sure and have all of your credentials available when creating the System DSN and Bulk Write DCM.