Use Alteryx Generic OAuth feature to securely authenticate to Databricks on AWS
The Alteryx Designer 2023.2 release introduces support for the Generic OAuth2 authentication method for a number of data sources. This method lets users leverage OAuth2-based authentication for a seamless sign-in experience without compromising on security. With this feature, users can pick an identity provider (IdP) of their choice, integrate it with the data source, and use IdP-issued identities to set up connections from Alteryx Designer. The complete list of data sources that support this authentication method can be found in the Designer 2023.2 release notes.
This post demonstrates how the Generic OAuth authentication method can be used to set up OAuth-based authentication for Databricks running on AWS.
Please note that the following example is intended for demonstration purposes only. We recommend engaging your systems team to help you with the configuration. This example covers user-to-machine (U2M) OAuth authentication.
U2M interactions with the Databricks DBSQL API involve users working directly with the API to perform tasks such as executing SQL queries, managing clusters, and creating or modifying databases and tables.
To access Databricks data on AWS using Generic OAuth authentication, complete the following steps:
- Register an OAuth application in your Databricks account on AWS;
- Obtain the authentication details required to set up a new connection between Alteryx Designer and Databricks;
- Set up a new connection with Databricks.
If you don’t know where to find these details, don’t worry: step-by-step instructions follow below.
Register an OAuth application in Databricks Account on AWS
First, enable OAuth for your Databricks account by sending the following request:
curl -X POST "https://accounts.cloud.databricks.com/api/2.0/accounts/<Databricks account ID>/oauth2/enrollment" \
  --header "Authorization: Bearer $OAUTH_TOKEN"
To verify that OAuth has been enabled for your Databricks account, run the following request:
curl -X GET "https://accounts.cloud.databricks.com/api/2.0/accounts/<Databricks account ID>/oauth2/enrollment" \
  --header "Authorization: Bearer $OAUTH_TOKEN"
This request should return the following response:
{"is_enabled":true}
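The verification step can also be scripted. Below is a minimal Python sketch, assuming the enrollment endpoint and the response shape shown above. The helper names `enrollment_url` and `is_oauth_enabled` are hypothetical and introduced only for illustration; the sketch parses a response body and does not send the request itself.

```python
import json


def enrollment_url(account_id: str) -> str:
    """Build the OAuth enrollment URL used by the curl commands above."""
    return (
        "https://accounts.cloud.databricks.com/api/2.0/accounts/"
        f"{account_id}/oauth2/enrollment"
    )


def is_oauth_enabled(response_body: str) -> bool:
    """Return True only if the response JSON reports "is_enabled": true."""
    payload = json.loads(response_body)
    return payload.get("is_enabled") is True


# Example with the response body shown in this post.
print(is_oauth_enabled('{"is_enabled":true}'))
```

Checking for the literal value `true`, rather than any truthy value, avoids treating an unexpected response shape as success.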
Next, register the OAuth application by sending the following request:
curl -X POST \
  --header "Authorization: Bearer $OAUTH_TOKEN" \
  -d '{ "redirect_urls" : [ "<Redirect URL>" ], "confidential" : true|false, "name" : "<Name>" }' \
  "https://accounts.cloud.databricks.com/api/2.0/accounts/<AccountID>/oauth2/custom-app-integrations"
The parameters are:
- redirect_urls - the list of redirect URLs, for example ["http://localhost:5000"];
- confidential - set to true if you need an OAuth client secret, otherwise set to false;
- name - the name of your OAuth client;
- scopes - the supported scopes;
- AccountID - your Databricks account ID.
An example request body:
{
"redirect_urls":["http://localhost:5000"],
"confidential":true,
"name":"oauth client2",
"scopes":"all-apis"
}
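Hand-written JSON inside a shell command is easy to get wrong (quoting, booleans), so the request body can also be generated. Below is a minimal Python sketch that uses the field names from the example above. `build_app_payload` is a hypothetical helper, and passing the scopes as a list is an assumption; check the format your Databricks account API expects.

```python
import json
from typing import List


def build_app_payload(name: str, redirect_urls: List[str],
                      confidential: bool, scopes: List[str]) -> str:
    """Serialize the OAuth client registration body as JSON."""
    payload = {
        "redirect_urls": redirect_urls,
        "confidential": confidential,  # serialized as JSON true/false
        "name": name,
        "scopes": scopes,  # assumption: scopes are sent as a list
    }
    return json.dumps(payload)


body = build_app_payload(
    name="oauth client2",
    redirect_urls=["http://localhost:5000"],
    confidential=True,
    scopes=["all-apis"],
)
print(body)
```

The resulting string can be passed to curl with `-d "$BODY"` in place of the inline JSON.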
Obtain authentication details required to set up a new connection between Alteryx Designer and Databricks
To set up a new connection to Databricks with the Generic OAuth2 authentication method, you need to collect the following details:
- Authentication endpoint
- Token endpoint
- scope
- client_id
- client_secret (optional)
In the previous step, we already collected the client_id and client_secret, which are returned when the OAuth application is registered. In this step, let’s collect the authentication and token endpoint details. To find them for your instance, you can use the Databricks OpenID .well-known endpoint.
Make a GET request to the following endpoint, replacing {databricks-host} with your Databricks host:
https://{databricks-host}/oidc/.well-known/openid-configuration
In response, you should get a JSON document listing your authentication and token endpoints. For example:
{
"authorization_endpoint":"https:\/\/dbc-938d9d6e-0fb3.cloud.databricks.com\/oidc\/v1\/authorize",
"token_endpoint":"https:\/\/dbc-938d9d6e-0fb3.cloud.databricks.com\/oidc\/v1\/token",
"issuer":"https:\/\/dbc-938d9d6e-0fb3.cloud.databricks.com\/oidc",
"jwks_uri":"https:\/\/accounts.cloud.databricks.com\/oidc\/jwks.json",
"scopes_supported":["offline_access","all-apis"],
"response_types_supported":["code","token"],
"response_modes_supported":["query","fragment"],
"grant_types_supported":["client_credentials","authorization_code","refresh_token"],
"code_challenge_methods_supported":["S256"],
"token_endpoint_auth_methods_supported":["client_secret_basic","client_secret_post","none"]
}
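The values needed for the connection can be pulled out of this document programmatically. Below is a minimal Python sketch, assuming a response shaped like the example above. `extract_endpoints` is a hypothetical helper, and the sample is an abbreviated copy of the example response. Note that the escaped slashes (`\/`) are decoded to plain `/` by the JSON parser.

```python
import json


def extract_endpoints(discovery_json: str) -> dict:
    """Pick the fields needed for the Designer connection from the
    OpenID discovery document."""
    doc = json.loads(discovery_json)
    return {
        "authorization_endpoint": doc["authorization_endpoint"],
        "token_endpoint": doc["token_endpoint"],
        "scopes_supported": doc.get("scopes_supported", []),
    }


# Abbreviated copy of the example response; a raw string keeps "\/" intact.
sample = r'''{
  "authorization_endpoint": "https:\/\/dbc-938d9d6e-0fb3.cloud.databricks.com\/oidc\/v1\/authorize",
  "token_endpoint": "https:\/\/dbc-938d9d6e-0fb3.cloud.databricks.com\/oidc\/v1\/token",
  "scopes_supported": ["offline_access", "all-apis"]
}'''

endpoints = extract_endpoints(sample)
print(endpoints["authorization_endpoint"])
print(endpoints["token_endpoint"])
```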
Collect and store the values of the authorization and token endpoints. Please also note the scopes enabled for this application. We’ll need both “offline_access” and “all-apis” for this tutorial.
Let’s recap: by now we’ve enabled OAuth authentication for our Databricks instance, created a new OAuth client, and collected the required details. Let’s move on and finally connect to our Databricks instance.
Set up a new connection with Databricks
To access data in Databricks from Alteryx Designer, add an Input or Output tool, check the “Use Data Connection Manager (DCM)” box, and select Databricks or Databricks Unity Catalog from the list of available data sources in Alteryx Designer. Select Quick Connect and provide your Databricks instance details.
Next, click Create New Credentials. Provide a name for the credentials and specify the scopes we collected earlier, separated by a space, e.g. “all-apis offline_access”.
This was incredibly helpful, thank you for pulling it together.
The Generic OAuth Configuration screen has changed a bit (I am using 2024.2): instead of asking for the OAuth redirect port, it now asks for a URI. I tried using https:localhost:5000 but kept getting a validation error. When I checked the dcmschema definition files, I realized that the validation regex only looks for http. So, a warning for anyone else trying this: make sure that your URI is HTTP, not HTTPS.
The other issue I ran into was the scope when registering the OAuth client. I initially used just "all-apis", but when I tried to connect I got an error about the scopes, and I ended up needing to do a PATCH request to update the configuration to include:
"scopes": [
"all-apis",
"offline_access",
"sql",
"email",
"openid",
"profile"
]
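For readers who need the same fix, here is a minimal Python sketch of how such a PATCH request could be assembled. `build_scope_update_request` is a hypothetical helper. The URL assumes that the integration ID returned at registration is appended to the custom-app-integrations path used earlier in the post, so please verify it against the Databricks account API documentation. The request is only built here, not sent.

```python
import json
import urllib.request


def build_scope_update_request(account_id: str, integration_id: str,
                               token: str, scopes: list) -> urllib.request.Request:
    """Build (but do not send) a PATCH request that updates the scopes."""
    # Assumption: the integration ID is the last segment of the path.
    url = (
        "https://accounts.cloud.databricks.com/api/2.0/accounts/"
        f"{account_id}/oauth2/custom-app-integrations/{integration_id}"
    )
    body = json.dumps({"scopes": scopes}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PATCH",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


request = build_scope_update_request(
    "my-account-id", "my-integration-id", "my-token",
    ["all-apis", "offline_access", "sql", "email", "openid", "profile"],
)
print(request.get_method(), request.full_url)
```

To send it, pass the object to `urllib.request.urlopen(request)` with real values in place of the placeholders.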
Hi @bkclaw113, thanks for your comment. A few notes for future readers:
- The redirect URI in Designer shall always conform to the http, not https, protocol;
- Raw API calls are no longer required to configure or enable the OAuth client in Databricks: Databricks now offers a UI for that;
- Scope: the required scopes depend on the type of operations you're looking to perform. Our recommendation is: a. all-apis - allows interactions with all Databricks APIs your identity is entitled to; b. offline_access - tells Databricks to issue refresh tokens, which are used to obtain new access tokens.
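To illustrate what offline_access enables, here is a minimal Python sketch of the form body for a standard OAuth2 refresh-token grant, which a client posts to the token endpoint to obtain a new access token. `build_refresh_body` is a hypothetical helper and all values are placeholders. Sending the client secret in the body corresponds to the client_secret_post method listed in the discovery document. Designer handles this exchange itself.

```python
from typing import Optional
from urllib.parse import urlencode


def build_refresh_body(refresh_token: str, client_id: str,
                       client_secret: Optional[str] = None) -> str:
    """Form-encode the parameters of an OAuth2 refresh-token grant."""
    fields = {
        "grant_type": "refresh_token",
        "refresh_token": refresh_token,
        "client_id": client_id,
    }
    if client_secret is not None:
        # Only confidential clients have a secret to send.
        fields["client_secret"] = client_secret
    return urlencode(fields)


print(build_refresh_body("my-refresh-token", "my-client-id"))
```

The body is sent with the `Content-Type: application/x-www-form-urlencoded` header.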
We're currently working on a newer version of this post, which should be available at the link below later this month:
“The redirect URI in Designer shall always conform to the http, not https, protocol;”
Why? This is a specific choice by Alteryx, right? It's not an OAuth restriction or a stylistic standard for OAuth redirect URIs. "Always" seems pretty strongly worded. Perhaps "currently" would be a better fit?
