Alteryx Server Discussions


Alteryx Server DCM Google BigQuery read and write schedule

iansouza
5 - Atom

Hi there, I'm facing an error on Alteryx Server and want to know if any of you have run into the same issue and how you overcame it.

 

The steps I did are as follows:

 

1 - Create a workflow that reads data from a table on Google BigQuery, using DCM Quick Connect to do it.

2 - Write data to a table on Google BigQuery, using DCM bulk to do it.

3 - Publish the workflow on server.

4 - Synchronize your credentials on Connection Manager to server.

5 - Share your workflow in a collection where a colleague has access.

6 - Share your credentials on DCM Data Sources with that colleague.

7 - Ask them to schedule your workflow.

 

When doing this, the workflow scheduled by my colleague ran once with no errors, but from the second attempt onward I got the following error.

[screenshot: the original error message]

Which translates to "Failed to complete write to Google BigQuery: Missing object."
I've run tests and it only happens in this specific scenario.

I was following this link to set up the connection: What is Data Connection Manager (DCM)?

 

My Alteryx Server version is 2022.3.1.430.

7 REPLIES
JMB001
8 - Asteroid

I'm having the same exact issue on 22.3. This is probably a bug.

 

Slightly different circumstance though:

 

1. Opened workflow from server Collection that is owned by a colleague with an existing schedule.

2. Changed output to Bulk - DCM with Connection that both myself and colleague have OAuth access to. Sync'd Connections successfully.

3. Saved workflow back to Collection. 

4. Shared Connection via Gallery > DCM Data Sources

5. Let the schedule run with the new workflow.

 

When I save to my own Workspace and do a Manual or Scheduled run it always succeeds, but not when it runs under the ownership of my colleague who owns the Workflow and the Schedule.

 

But given that the only guidance in the Alteryx documentation is "share the connection", this should work.

 

So, I'm thinking the issue is on the Google Cloud Storage side, since it's looking for a missing "object", which suggests either a Python issue or that the GCS storage object is not being allocated.

 

Can anyone else look into this?

 

JMB001
8 - Asteroid

Update: the "Missing Object" is likely one of the DCM Connection objects (the Data Source, the Credentials, or both).

 

Anyone looking into this?

JMB001
8 - Asteroid

Alteryx 2023.1.1.281 Patch 3

 

We had a user who ran into a similar issue:

  1. Workflow had 3 output tools
  2. All 3 used the same DCM Connection
  3. The DCM was Synchronized to the Gallery
  4. The Workflow was Scheduled
  5. The Scheduled Workflow job ran successfully and all 3 output tools outputted to BigQuery
  6. The Scheduled Workflow job failed after two days - but only on one of the output tools, while the other two ran successfully.

Error - "Missing Object", meaning the DCM tool couldn't find the Data Source or Credential object in DCM.

 

This seems like erratic behavior on the Server side of this DCM technology.

JMB001
8 - Asteroid

 

This is a matrix of the scenarios our team has tested. If someone could verify our results with your own tests, that would be great.

 

Failures are "Missing Object" errors, as if the DCM instance of the user running the workflow can't find the DCM Connection object that was shared with them.

 

Conditions:

  1. All Workflows and Schedules shared in Collection
  2. All DCM Connections Synchronized to Alteryx Gallery
  3. All DCM Connections shared with other user

 

Workflow Owner | DCM Connection Owner | Schedule Owner | Manual Run by User 1 | Manual Run by User 2 | Scheduled Run by User 1 | Scheduled Run by User 2
User 1         | User 1               | User 1         | Yes                  | No                   | Yes*                    | No
User 2         | User 1               | User 1         | Yes                  | No                   | Yes*                    | No
User 2         | User 2               | User 1         | No                   | Yes                  | No                      | Yes
User 1         | User 2               | User 1         | No                   | Yes                  | Maybe**                 | Yes

*  May get failures on occasion.
** Has failed when creating a new schedule; has also failed on the second scheduled run after a successful first run.
apathetichell
18 - Pollux

Is your data in a GCS bucket - or just in a BQ table? Does your Server have an underlying GCP identity (ie is it a compute instance), or is it otherwise synced to a GCP project (ie can it be granted access to BQ via a GCP IAM service account)? If so - I'd recommend skipping the BQ tool/connection and scripting this as a GCP CLI transfer to GCS - and letting BQ do the heavy work internally.

 

JMB001
8 - Asteroid

Hey @apathetichell thanks for your comments!

 

Is your data in a GCS bucket - or just in a BQ table? The Google - DCM - Bulk connection uses a GCS bucket as its transfer mechanism. I'm pretty sure the error occurs before that is instantiated, since it says "Missing Object", which led me to believe it refers to the DCM object(s) (see the documentation). GCS also uses "objects", but I don't think it would produce the same error.

 

Does your Server have an underlying GCP identity (ie is it a compute instance), or is it otherwise synced to a GCP project (ie can it be granted access to BQ via a GCP IAM service account)? I don't know - I don't think so. And in these cases the Server would be using my User to run the workflow, rather than a Service Account. Our org does have a GCP Service Account, but they don't hand it out readily. So I think this avenue is a non-starter at this point.

 

If so - I'd recommend skipping the BQ tool/connection and scripting this as a GCP CLI transfer to GCS - and letting BQ do the heavy work internally. What do you mean by scripting this as a GCP CLI transfer? Do you mean running the whole workflow as a script in the GCP CLI? If so, that's not advantageous, since we need to leverage Alteryx for the transformation layer - but again, I'm not quite sure what you mean.

 

I'm open to any other ideas you have.

apathetichell
18 - Pollux

Hey - so in GCP a service account isn't exactly a service account in common parlance. A service account in GCP is both an entity and an IAM permission (ie you have permission to use a service account - which is mostly an internal construct, but it determines what you can access and your level of access). This is incredibly confusing if you use both GCP and AWS...

 

You auth into GCP via OAuth - right? (ie a browser window pops up and you click through) - that would not be recreatable on Server. Access via JWT would be. If your InfoSec won't let you - your InfoSec won't let you. If your Server is really an EC2 or GCP Compute Instance, it probably has a native role which you can build up - and which your InfoSec may be happy with, because all access would be via role/federated identity (at least if they are like my InfoSec/CloudSec, they'd be happier).
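
A minimal sketch of that JWT-style, non-interactive auth, assuming your InfoSec issues you a service-account key file (key.json is a hypothetical name):

```
# Activate a service account from its key file - no browser pop-up,
# so this works in an unattended Server/scheduled context.
gcloud auth activate-service-account --key-file=key.json

# Confirm which identity is now active.
gcloud auth list
```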

 

In my vision you'd avoid the silliness of Alteryx's BQ interaction by scripting the last part (ie putting the data in BQ) via a transfer to GCS and having BQ build the table off of GCS.
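
A minimal sketch of that last mile, assuming hypothetical names (my-bucket, my_dataset.my_table) and a CSV extract that the Alteryx transformation layer has already produced:

```
# Stage the extract as a GCS object.
gsutil cp extract.csv gs://my-bucket/staging/extract.csv

# Have BigQuery load the table directly from the GCS object - the heavy
# lifting happens server-side, no Alteryx BQ bulk loader involved.
bq load --source_format=CSV --skip_leading_rows=1 \
  my_dataset.my_table gs://my-bucket/staging/extract.csv
```

You could run something like this from a scheduled script, keeping Alteryx for the transforms and handing only the load step to the CLI.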

 

And yes - object is probably a DCM object or other internal object - not a GCS object.