When using a SQL statement with a WITH query expression, I get the following error: "No select statement found." I was told that WITH statements are currently not supported.
Why this should be changed:
Best regards
Marcel
Details about the syntax:
https://cloud.google.com/bigquery/docs/reference/standard-sql/query-syntax
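While WITH expressions are unsupported, one possible workaround is to rewrite the CTE as an inline subquery, which is usually equivalent. A minimal sketch, using Python's built-in sqlite3 as a stand-in SQL engine (table and column names are made up for illustration):

```python
import sqlite3

# Stand-in engine and toy table to compare the two query forms.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [(1, 10.0), (2, 25.0), (3, 40.0)])

# The WITH (CTE) form that triggers "No select statement found".
with_query = """
WITH big_orders AS (
    SELECT id, amount FROM orders WHERE amount > 20
)
SELECT COUNT(*) FROM big_orders
"""

# Equivalent rewrite: the CTE body inlined as a subquery.
subquery = """
SELECT COUNT(*) FROM (
    SELECT id, amount FROM orders WHERE amount > 20
) AS big_orders
"""

with_count = conn.execute(with_query).fetchone()[0]
sub_count = conn.execute(subquery).fetchone()[0]
print(with_count, sub_count)  # 2 2
```

Both forms return the same result; the subquery version avoids the unsupported WITH syntax (this does not help with recursive CTEs, which have no simple inline rewrite).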
Related customer questions
Hi,
I use the import-by-folder feature for GCS files to import many files at once (both present and future files dropped into the same folder). Sometimes the data schema is not exactly the same across files, but the column names are always the same! I would like to use a "union by name" for the initial union of the files in the imported folder. With this option, my import would keep working even if the data schema changes in the future!
We could have a screen like the "recipe union screen" for this "import with union" (for imports by folder), to select the columns to import and the type of matching, for example.
This is a real issue for me, because when the data schema of one file changes, the scheduled runs fail...
Sorry for my bad English, I'm French :-)
Thanks !
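The requested "union by name" behavior can be sketched with pandas, which aligns concatenated frames by column header rather than by position (the frames below are invented examples of two files whose schemas drifted):

```python
import pandas as pd

# Two files from the same folder: same column names, different order,
# plus one extra column that appeared in the newer file.
jan = pd.DataFrame({"id": [1, 2], "amount": [10.0, 20.0]})
feb = pd.DataFrame({"amount": [30.0], "id": [3], "country": ["FR"]})

# Union by name: columns are matched by header, and columns missing
# from a file are filled with NaN instead of failing the run.
union = pd.concat([jan, feb], ignore_index=True, sort=False)
print(union.columns.tolist())  # ['id', 'amount', 'country']
```

This is the semantics being asked for at import time: a schema change adds NaN-filled columns instead of breaking the scheduled run.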
When you select a column to apply a function or transformation, several column-selection methods are available.
But it is not possible to select columns with a "RegEx match" on the column names.
It would be much easier!
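For comparison, this is how regex-based column selection works in pandas: `DataFrame.filter(regex=...)` picks every column whose name matches a pattern (the column names below are invented for illustration):

```python
import pandas as pd

df = pd.DataFrame({"sales_2021": [100], "sales_2022": [120],
                   "cost_2021": [80], "region": ["EU"]})

# Select every column whose name matches a regular expression,
# e.g. all yearly sales columns at once.
sales_cols = df.filter(regex=r"^sales_\d{4}$").columns.tolist()
print(sales_cols)  # ['sales_2021', 'sales_2022']
```

A "RegEx match" option in the column selector would give the same one-pattern-selects-many behavior inside a recipe step.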
Why must I open all the recipes to reload each sample?
For example:
I build flows with many recipes (between 60 and 100; it's a real case for me).
On Monday, I make a lot of modifications to the "data cleaning" steps at the start of the data-wrangling chain!
On Tuesday, when I try to open the other recipes, I get the warning message "your sample needs to be updated"!
=> If I had an "update all samples of the flow" button, I could run it on Monday before going to sleep, and on Tuesday I could work with a smile!
PS: Sorry for my bad English, I'm a French user :-)
I often receive datasets that have rows above the column headers that I don't need. When importing the dataset, there is a dropdown in the edit menu to "make the first row a column header". However, I would like this dropdown to also include an option such as "make row 20 the column header and delete all preceding rows", which would let me import the data with the column headers already in place. With a single dataset, I can always choose any row to become the column headers, but when I have to join 20 similar datasets, that isn't practical. Not sure if my idea is clear (lol), but it seems like something that could easily be incorporated into the tool. Thanks!
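The requested import behavior corresponds to pandas' `skiprows` option on `read_csv`: discard the junk rows, then treat the next row as the header. A minimal sketch with two junk rows (the idea scales to "make row 20 the header"); the file content is invented:

```python
import io
import pandas as pd

# A file with junk rows above the real header row.
raw = "report generated 2023-01-01\nconfidential\nid,name\n1,Alice\n2,Bob\n"

# skiprows=2 discards the two preceding rows, so the third physical
# row becomes the column header (for "row 20", use skiprows=19).
df = pd.read_csv(io.StringIO(raw), skiprows=2)
print(df.columns.tolist())  # ['id', 'name']
```

Applied at import time across 20 similar files, the same setting would give every dataset identical headers before the join.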
We would strongly like the ability to edit datasets created with custom SQL that have been shared with us. We think of Trifacta in part as a shared development space, so if one user needs to update a dataset but wasn't originally the owner, this slows down our workflow considerably.
The ability to apply various interpolation methods (cspline, linear, etc.) between sorted columns of integers.
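For the linear case, the requested behavior can be sketched with NumPy's `interp`, which fills the missing integer positions between known sorted values (cubic splines would need e.g. `scipy.interpolate.CubicSpline`; the data below is invented):

```python
import numpy as np

# Known values at sorted integer positions, with gaps to fill.
x_known = np.array([0, 2, 5])
y_known = np.array([0.0, 4.0, 10.0])

# Linear interpolation at every integer position, including the gaps.
x_all = np.arange(6)  # 0..5
y_all = np.interp(x_all, x_known, y_known)
print(y_all.tolist())  # [0.0, 2.0, 4.0, 6.0, 8.0, 10.0]
```

An interpolation transform in a recipe could expose the same inputs: the sorted key column, the value column, and the method (linear, cspline, ...).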
Use linked datasets created by GCP Analytics Hub as a data source in Dataprep. Detailed information in the link below:
Can I use linked dataset (created by Analytic Hub in GCP) to build flows in DataPrep? (trifacta.com)
Case: 00027615 - we created a case for our issue but learned that the functionality is not present.
We had an OAuth login issue when trying to set up a connection to Snowflake, because we use Okta as our IdP for Snowflake.
We want our users to create their own Snowflake connectors using their personal credentials through the IdP, which will enforce their role in Snowflake so they can see only the schemas they are allowed to see.
We cannot create a generic connector, because it would grant more data access than users need (including PII), so we want to use their Snowflake functional roles to restrict it.
It's a really good use case for anyone using Snowflake with an IdP and RBAC set up in Snowflake.
Allow functionality in the app for customizing the support page, so that users can contact our team when there is an issue with the application. The page should show our email address, not the Trifacta support email address.
We need a custom viewer role so that users can only use connections shared with them, but cannot re-share those connections with others. In our case, an admin will set up the connections and users will just use them. Users should not be able to create or share connections. This would improve connection security and access to data.
It would be nice for Trifacta to be able to export files in CDM (Common Data Model) format to ADLS Gen2, so that they are fed automatically into Power BI for reporting purposes.
Please allow connections to be created from Trifacta to SharePoint Online using SSO authentication, just like for Azure SQL/DWH.
Being able to Publish outputs directly to Google Sheets would be a major benefit for Sheets users.
It would be great if you could expand the metadata selection so that it is not limited to two elements (row number and file path), but could also include a date timestamp (e.g. $datecreated) to be used in recipes.
We need the ability to create folders underneath plans. We can create folders underneath flows, but not underneath plans. Additionally, we need the ability to create subfolders inside these parent flow and plan folders. It is hard to organize flows and plans without being able to put them into categories (folders) and subcategories (subfolders) once you approach hundreds of plans and flows.
To monitor the status of a plan that runs several different flows (in my case, around 300), I send an HTTP request to Datadog to display the failed and successful results on a dashboard. The problem is that Datadog only understands epoch timestamps, not datetime values, and right now we cannot convert the timestamp into epoch. I was thinking of approaching this problem in the following ways:
1) Having a pre-request script
2) Creating dynamic parameters in Dataprep, instead of a fixed value, that can then be used in the HTTP request body
3) This is just a workaround: creating a table that stores the flow name and timestamp, and using that table in the plan every time we run a flow. It would work, but it isn't the right way; we would waste time creating a separate table like this for each flow.
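The conversion itself is simple; what is missing is a way to perform it inside the HTTP-request step. A minimal sketch of what a pre-request script (option 1) would need to do, with an invented timestamp format and value:

```python
from datetime import datetime, timezone

# Convert a flow-run timestamp string into the epoch seconds that
# Datadog expects in the HTTP request body.
run_time = "2023-05-17 14:30:00"  # example value; assumed UTC
dt = datetime.strptime(run_time, "%Y-%m-%d %H:%M:%S").replace(tzinfo=timezone.utc)
epoch = int(dt.timestamp())
print(epoch)  # 1684333800
```

Exposing this as a dynamic parameter (option 2), e.g. something like an `$epochtime` variable usable in the request body, would avoid the per-flow workaround tables entirely.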
I'm looking for a way to discover which datasets, recipes, or outputs are taking up the most time and resources.
It would also be nice to be able to view this over time.
An example would be something like the Unity3d profiler:
https://docs.unity3d.com/uploads/Main/profiler-window-layout.png
This is for a video game engine, but I hope the system could be similar.
In this profiler you can see which resource (RAM, CPU, GPU) is being used, and by which character/object in your video game.
Similarly, it would be nice to see which database is being used by which flow in Trifacta.
The current syntax for the WORKDAY function is workday(date1, numDays, [array_holiday]), and array_holiday cannot be a column in a table. When there are unpredictable non-trading days, such as typhoon weather, we always have to go and change the public holidays inside the recipe. We would prefer the holidays to come from a column in a table, so we could just import the table and update it when needed.
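The requested semantics can be sketched in Python: a hypothetical `workday` that takes its holiday set from data (e.g. a loaded table column) rather than a literal array hard-coded in the recipe. All dates below are invented:

```python
from datetime import date, timedelta

def workday(start: date, num_days: int, holidays: set) -> date:
    """Hypothetical WORKDAY: step forward num_days business days,
    skipping weekends and any date found in the holiday column."""
    d = start
    remaining = num_days
    while remaining > 0:
        d += timedelta(days=1)
        if d.weekday() < 5 and d not in holidays:  # Mon-Fri, not a holiday
            remaining -= 1
    return d

# Holidays sourced from an importable table column instead of being
# hard-coded: adding an ad-hoc typhoon closure means updating the
# table, not editing every recipe.
holidays = {date(2023, 7, 27)}  # unplanned non-trading day (Thursday)

result = workday(date(2023, 7, 25), 3, holidays)
print(result)  # 2023-07-31
```

Starting Tuesday 2023-07-25, three business days forward skips the Thursday closure and the weekend, landing on Monday 2023-07-31; changing the holiday table changes the result with no recipe edit.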