Hi Community,
We have created a flow and recipe in Dataprep.
We run our flows via GCP Composer and also directly in Dataprep.
The running environment on our flow output is set to Dataflow + BigQuery.
When we run the flow directly from the Dataprep UI, the selected running environment is BigQuery and the job succeeds.
However, when we run the same flow from Composer (Orchestrator), calling Dataprep via a DAG, the running environment is Dataflow.
When it runs on Dataflow, the job fails with a schema error.
Hi @Auggy,
What happens if you disable optimization [for the flow] in Dataprep, thus forcing it to run strictly on Dataflow there as well? Are the results then consistent with the run that Composer (Orchestrator) triggers via the DAG? If so, then I would infer that...
Let us know how it goes.
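If it helps with the comparison, here is a minimal sketch of a request body that asks for Dataflow explicitly, so a run triggered outside the UI can be checked against the UI run. It is an assumption on my part that your DAG ends up posting to the Dataprep v4 `jobGroups` endpoint and that `overrides.execution` accepts `"dataflow"` in your project, so please check both against the API reference. The helper name `build_job_group_request` and the recipe id `12345` are hypothetical.

```python
import json


def build_job_group_request(recipe_id, execution="dataflow"):
    """Build a JSON-serializable body for a Dataprep job run.

    Assumption: the field names ("wrangledDataset", "overrides", "execution")
    follow the Dataprep v4 jobGroups API as I understand it. Verify them
    against the API reference before relying on this.
    """
    if not isinstance(recipe_id, int) or recipe_id <= 0:
        raise ValueError("recipe_id must be a positive integer")
    return {
        # The recipe (wrangled dataset) whose output should be generated.
        "wrangledDataset": {"id": recipe_id},
        # Request the running environment explicitly instead of relying on
        # whatever the caller would otherwise pick.
        "overrides": {"execution": execution},
    }


if __name__ == "__main__":
    # 12345 is a placeholder recipe id, not a real one.
    print(json.dumps(build_job_group_request(12345), indent=2))
```

Running the same recipe once with this body and once from the UI with optimization disabled should show whether the schema error follows the Dataflow environment rather than the Composer trigger.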
It sounds like Dataprep is working as expected, so it might be worthwhile to start a discussion with GCP support in parallel.
Cheers,
Nathanael