Databricks Connection
We're just starting out connecting to Azure-based data, and I would like to know whether there is a way to connect to Databricks without having to generate a token. All of the documentation I have found suggests that generating a token is unavoidable, both in Alteryx and for the ODBC driver you can download directly from Databricks. Our current company policy is that access is granted via Azure AD, and the platform administrators are very reluctant to allow tokens.
- Labels:
- Database Connection
- Input
I'm looking for a similar answer.
I do not believe this is natively supported by Databricks' ODBC driver. Your IT team can talk to Databricks, who can point them here: https://databricks-sdk-py.readthedocs.io/en/latest/oauth.html - the approach is to connect via the Python SDK or the CLI and use SSO there. Databricks integration services are really designed for service-account activity.
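For what it's worth, the SDK side of this is only a few lines. A minimal sketch, assuming the databricks-sdk Python package and a placeholder workspace URL - auth_type="external-browser" triggers the browser-based OAuth flow, so you sign in with Azure AD and no PAT is ever minted:

```python
# Minimal sketch: browser-based OAuth (SSO) with the Databricks Python SDK.
# The workspace URL below is a placeholder for your Azure Databricks host.
from databricks.sdk import WorkspaceClient

# auth_type="external-browser" starts the OAuth user-to-machine flow: the SDK
# opens a browser window, you sign in with Azure AD, and only short-lived
# OAuth credentials are used - no personal access token.
w = WorkspaceClient(
    host="https://adb-1234567890123456.7.azuredatabricks.net",  # placeholder
    auth_type="external-browser",
)

# Sanity check: list the clusters visible to the signed-in user.
for cluster in w.clusters.list():
    print(cluster.cluster_name)
```

The catch, as discussed below, is that Alteryx's Input Data tool has no place to run code like this - it only speaks ODBC.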
Thanks for this! I'm a bit confused about how to use this within Alteryx. I only see the standard Input Data tool, which doesn't let me enter any kind of code.
I already have SSO access to Databricks; I just need a way to tie that SSO access to Alteryx. You are correct that the native ODBC driver doesn't allow for that, so I'm looking for other ways to connect to Databricks or our data lake.
What you sent looks promising, I'm just not sure how or where to input that code into Alteryx to set up the connection. If you have any thoughts, I'm all ears. Appreciate any assistance here!
It is not out of the box:
1) You will need to be able to access your Databricks cluster via the CLI. Your team will have to help you, and this may require installing WSL (Linux for Windows). If this is too much work for your team, stop here.
2) You will need to integrate however you are connecting to Databricks with Alteryx. This is done either via the Run Command tool (i.e., you run your queries through the CLI from Run Command) or possibly via Python.
3) You would build your workflow around using and controlling Run Command instead of Input Data: the workflow would create a script, run the script, and then process the script's output. It is not easy. See the sketch after this list for the general shape.
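To make step 3 concrete, here is a rough sketch of the kind of script the Run Command tool could invoke. It assumes a recent databricks-sql-connector (one that supports OAuth U2M via auth_type="databricks-oauth"); the hostname, HTTP path, and query are placeholders for your environment:

```python
# Sketch of a script for the Alteryx Run Command tool: sign in via browser-based
# OAuth (no PAT), run a query against a SQL warehouse, and write a CSV that a
# downstream Input Data tool can pick up.
import csv

from databricks import sql  # pip install databricks-sql-connector

with sql.connect(
    server_hostname="adb-1234567890123456.7.azuredatabricks.net",  # placeholder
    http_path="/sql/1.0/warehouses/abcdef1234567890",              # placeholder
    auth_type="databricks-oauth",  # OAuth U2M: opens a browser for Azure AD SSO
) as conn:
    with conn.cursor() as cur:
        cur.execute("SELECT * FROM samples.nyctaxi.trips LIMIT 100")  # placeholder query
        columns = [desc[0] for desc in cur.description]
        rows = cur.fetchall()

# Land the result where the rest of the workflow expects it.
with open("databricks_extract.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(columns)
    writer.writerows(list(row) for row in rows)
```

The browser sign-in prompt is the weak point in an unattended workflow, which is part of why I say this is not easy.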
I believe these requirements are the same as you would have for connecting Databricks to any other integration where the connection originates outside of Databricks without traversing ODBC or JDBC (e.g., Tableau, Power BI, etc.). I'd recommend your team look into creating a SQL warehouse and providing a token/PAT for it with limited data access; I do not believe you can have a successful Databricks integration with these tools while maintaining a no-PAT policy. If your team needs to maintain a strict no-PAT policy and you are systemically important, perhaps you can have Databricks run jobs that place enriched data in blob storage, and then have Snowflake query the data from the blob storage? Snowflake supports SSO via ODBC.
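If your admins do agree to a scoped PAT against a SQL warehouse, the connection itself is simple. A sketch via pyodbc, using the Simba Spark ODBC driver that Alteryx's Input Data tool also uses - the host, HTTP path, and token are placeholders:

```python
# Sketch: token auth (AuthMech=3) against a Databricks SQL warehouse through
# the Simba Spark ODBC driver. UID is the literal string "token"; PWD is the
# personal access token. Host, HTTPPath, and the PAT are placeholders.
import pyodbc

conn = pyodbc.connect(
    "Driver=Simba Spark ODBC Driver;"
    "Host=adb-1234567890123456.7.azuredatabricks.net;"  # placeholder
    "Port=443;"
    "HTTPPath=/sql/1.0/warehouses/abcdef1234567890;"    # placeholder
    "SSL=1;ThriftTransport=2;AuthMech=3;"
    "UID=token;"
    "PWD=dapiXXXXXXXXXXXXXXXX",                         # placeholder PAT
    autocommit=True,
)
print(conn.cursor().execute("SELECT current_user()").fetchone())
```

The same key/value pairs go into an Alteryx ODBC connection string or DSN, so this doubles as a way to verify the warehouse and token before touching the workflow.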
