Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.

Databricks Connection

Martyn
9 - Comet

We're just starting out in connecting to Azure based data and I would like to know if there is a way to connect to Databricks without having to generate a token (all documentation that I have found suggests that generating a token is inevitable - both on Alteryx and for the ODBC driver that you can download directly from Databricks)? Our company policy is currently that access is granted using Azure AD - and the administrators of the platform are very reluctant to use tokens.

4 REPLIES
sjdonofrio521
5 - Atom

I'm looking for a similar answer. 

apathetichell
19 - Altair

I do not believe this is natively supported by Databricks' ODBC driver. Your IT team can talk to Databricks, who can point them here: https://databricks-sdk-py.readthedocs.io/en/latest/oauth.html, and they can connect via the SDK/Python or the CLI and use SSO there. Integration services for Databricks are really designed for service account activity.
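For reference, the SDK/CLI route typically authenticates through a profile rather than a stored token. Here is a sketch of what a `~/.databrickscfg` profile using browser-based OAuth (SSO) might look like; the workspace URL is a placeholder, and your admins should confirm which auth types your workspace allows:

```ini
[DEFAULT]
; placeholder workspace URL - replace with your own
host      = https://adb-1234567890123456.7.azuredatabricks.net
; "external-browser" triggers interactive SSO in the user's browser,
; so no personal access token is stored in the file
auth_type = external-browser
```

Both the Databricks SDK for Python and the new Databricks CLI can pick this profile up automatically.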

sjdonofrio521
5 - Atom

Thanks for this! I'm a bit confused on how I use this within Alteryx. I only see the standard input data tool that doesn't allow me to enter any type of code into it.

I already have SSO access to Databricks, I just need to find a way to integrate that SSO access to Alteryx access. You are correct that the native ODBC driver doesn't allow for that, so I'm looking for other ways to connect to Databricks or our data lake.

What you sent looks promising, I'm just not sure how/where to input that code into Alteryx to set up the connection. If you have any thoughts, I'm all ears. Appreciate any assistance here!

apathetichell
19 - Altair

It is not supported out of the box:

1) You will need to be able to access your Databricks cluster via the CLI. Your team will have to help you. This may require installing WSL (Windows Subsystem for Linux). If this is too much work for your team - stop here.

2) You will need to integrate however you are connecting to Databricks with Alteryx. This is done either via the Run Command tool (i.e., you run your queries via the CLI inside Run Command) or via Python (maybe?).

3) You would build your workflow around using/controlling Run Command instead of Input Data. Your workflow would create a script, run the script, and then process the script's output. It is not easy.
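As a rough illustration of step 3, here is a minimal sketch (not Alteryx-specific) of generating the script that a Run Command tool could then execute. The warehouse ID, query, and the `databricks api post` invocation against the SQL Statement Execution API are assumptions based on the current Databricks CLI; adjust for your environment and verify you are authenticated first (e.g., via `databricks auth login`):

```python
import tempfile
from pathlib import Path

# Placeholders - replace with your SQL warehouse ID and query
WAREHOUSE_ID = "abc123"
QUERY = "SELECT 1"

def build_script(path: Path) -> Path:
    """Write a shell script that submits a query via the Databricks CLI.

    Alteryx's Run Command tool would execute this script; a downstream
    tool would then parse result.json. Extracting rows from the JSON
    response is omitted for brevity.
    """
    lines = [
        "#!/bin/sh",
        # `databricks api post` calls an arbitrary REST endpoint;
        # /api/2.0/sql/statements is the SQL Statement Execution API.
        f"databricks api post /api/2.0/sql/statements --json "
        f'\'{{"warehouse_id": "{WAREHOUSE_ID}", "statement": "{QUERY}"}}\' '
        "> result.json",
    ]
    path.write_text("\n".join(lines) + "\n")
    return path

script = build_script(Path(tempfile.gettempdir()) / "run_query.sh")
print(script.read_text())
```

The point is only to show the shape of the approach: the workflow's job becomes generating, running, and post-processing a script rather than using the Input Data tool directly.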

 

I believe that these requirements are the same as you would have for connecting Databricks to any other integration where the connection originates outside of Databricks without going through the ODBC or JDBC drivers (i.e., Tableau/Power BI/etc.). I'd recommend they look into creating a SQL warehouse and providing a token/PAT for it with limited data access. I do not believe you can have a successful Databricks integration with these tools while maintaining a no-PAT policy. If your team needs to maintain a strict no-PAT policy and you are systemically important, perhaps you can have Databricks run jobs to place enriched data in blob storage, and then have Snowflake query the data from the blob storage? Snowflake supports SSO via ODBC.
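If the SQL-warehouse-plus-scoped-PAT route wins out, the DSN setup is straightforward. A sketch of an `odbc.ini` entry for the Databricks (Simba Spark) ODBC driver, with placeholder driver path, host, HTTP path, and token; confirm the exact key names against the driver documentation for your version:

```ini
[Databricks_SQL_Warehouse]
Driver   = /opt/simba/spark/lib/64/libsparkodbc_sb64.so  ; placeholder path
Host     = adb-1234567890123456.7.azuredatabricks.net    ; placeholder
Port     = 443
SSL      = 1
ThriftTransport = 2
HTTPPath = /sql/1.0/warehouses/abc123def456              ; placeholder
AuthMech = 3             ; 3 = username/password
UID      = token         ; literal string "token" when using a PAT
PWD      = dapiXXXXXXXX  ; the scoped personal access token
```

Limiting the warehouse's data access (rather than the user's) is what makes the PAT acceptable to most security teams.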
