Users have asked for the ability to create new versions of recipes so that they can collaborate safely, and there is also a need to keep an audit history of changes. Trifacta has recipe-level history, but that does not fulfil the whole use case of version control.
Hello,
I need to store many variable regular expressions in a column so that I can use them in the MATCHES function (for example).
But Dataprep doesn't currently support using a column as an input in a way that the pattern inside it is read as an actual regular expression.
I think this could be a great feature!
Thanks.
More information about this case: https://community.trifacta.com/s/question/0D53j00007kB5UmCAK/matches-function-using-pattern-regex-st...
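Until this is supported natively, the requested behavior can be approximated outside Dataprep. A minimal pandas sketch (the column names and values are made up) of row-wise matching, where each row's pattern column is read as an actual regular expression:

```python
import re
import pandas as pd

# Hypothetical data: each row carries its own regex in a "pattern" column,
# and we want MATCHES-style behavior against the "value" column.
df = pd.DataFrame({
    "value":   ["order-123", "ref#99", "order-xyz"],
    "pattern": [r"order-\d+", r"ref#\d+", r"order-\d+"],
})

# Row-wise match: True when the value matches its own row's regex.
df["matches"] = [
    bool(re.search(p, v)) for v, p in zip(df["value"], df["pattern"])
]
```

The list comprehension is needed because vectorized string matchers take a single fixed pattern; a per-row pattern has to be compiled and applied row by row.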
Currently, when a recipe is copied, any data quality rules within the original recipe are not duplicated in the copy. To implement a systematic data quality program, the rules must be manually recreated for every single recipe, which obviously takes a lot of time. It would be great if the data quality rules could persist when the recipe is copied.
At the moment, long formulas are very difficult to read because they cannot be "beautified". Instead of allowing for multi-line text and indentation, our formulas are in a single-line, wrapped textbox. It would be very helpful if Trifacta supported "beautification-enabled" textboxes for formulas so that we can write formulas that are easy to read and understand.
Currently there is only one level of folders. In order to keep flows and other objects organized, it would be nice to create folders within folders. For example, create a top-level folder called "Sales Data" and a subfolder called "Sales Pipeline Data".
When our users request productionizing a flow, we require that they share the flow with our Data Engineering and Data Operations teams. Sharing with individuals currently works, but you have to remember to enter everyone's name. It would be nice if you could enter a GCP group name that contains the users, so everyone only has to remember one name.
When migrating flows from one environment to another, you currently must click on each flow to export it, and then import it into the other environment. It would be great to do this in bulk: tick several flows to export, export an entire folder, or export a plan.
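As a stopgap, per-flow export can be scripted. A minimal sketch, assuming the v4 REST API's per-flow package endpoint and a bearer token (the host name is made up, and the endpoint path is an assumption; verify both against your instance's API docs):

```python
import urllib.request

BASE = "https://wrangler.example.com"  # hypothetical Dataprep/Trifacta host

def export_url(flow_id: int) -> str:
    """Per-flow export endpoint (returns a zip package).
    The path is an assumption based on the v4 REST API."""
    return f"{BASE}/v4/flows/{flow_id}/package"

def download_flow(flow_id: int, token: str, dest: str) -> None:
    # Each exported flow comes back as a zip archive.
    req = urllib.request.Request(
        export_url(flow_id),
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(req) as resp, open(dest, "wb") as f:
        f.write(resp.read())

# Bulk export = loop over the ticked flow ids (not run here):
# for fid in [101, 102, 103]:
#     download_flow(fid, token="...", dest=f"flow_{fid}.zip")
```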
We would like to be able to split users into different user groups within the same workspace. Permissions to view and edit outputs, job runs and flows (including for administrators) would be allocated on a user group level (e.g. so one user group can edit flows, the associated job runs and outputs while other groups can only view them). Administrators would be allocated to a particular user group and their admin rights would apply to their own group only.
Standardize is an amazing function! ... if you know that you won't have any more values added to a column later. With standardize, it's impossible to account for future source values.
It would be super helpful if there were a way to add additional Source Values (and, accordingly, New Values for those source values) to account for values that might appear in the future (but aren't in your data right now).
I realize there are already a number of ways to account for "future" values. Some examples include if...then...else, condition column > case on single column, and condition column > case on custom column. However, these transforms are not friendly for low-code users who are unfamiliar with coding, and this no-code upgrade to Standardize could help them.
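For what it's worth, the no-code upgrade being asked for amounts to a mapping table that can hold entries for values that have not appeared yet. A minimal Python sketch (the values are made up) of Standardize-style replacement that tolerates future source values:

```python
# Standardize-style mapping; entries can be added for source values
# that are expected but not yet present in the data.
mapping = {
    "calif.": "California",
    "CA":     "California",
    "Calif":  "California",   # anticipated future source value
}

def standardize(value: str) -> str:
    # Unknown values pass through unchanged, so genuinely new source
    # values surface in the output instead of silently disappearing.
    return mapping.get(value, value)
```

The key design point is the fallback: pass-through for unmapped values keeps the step safe when the source grows.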
When I use Dataprep all day, I would prefer to have a dark theme to spare my eyes! If we could switch between a dark and a light theme, that would be the best solution!
:-)
PS: Sorry for my bad English, I'm a French user.
Right now there is no place where team members can collectively create and share flows. If we had the option to share folders among members, just as we do for flows, it would be a lot easier. For example: if there is a folder with 4 flows and I share the folder with my teammates, they can edit it, create new flows there, and see all 4 existing flows. But if I share only 2 of the 4 flows with someone, they see the folder but not the flows that weren't shared with them.
Please add features to your current "Folder" feature by allowing us to share them with other users, move a flow that has been "shared" with us from the root folder to a sub-folder, etc.
Someone has already submitted an idea for multiple levels of sub-folders which was another request that we had. Thanks.
Publishing to Parquet has been more performant than publishing to CSV, and we would love to have this feature implemented.
We make heavy use of TIBCO Data Virtualization server views and web services in our organization.
We need an official connector supported by Trifacta to connect to them and fetch data.
When I'm in the Flow workspace and I click on a recipe, the steps display in the floating box on the right.
Unfortunately, I can't select and copy the steps. :-/
I'm forced to load the recipe to copy its steps before pasting them into another recipe. I think everyone would save time if steps could be copied directly from the Flow view!
:-)
PS: Sorry for my bad English, I'm a French user.
Currently, when pivoting a specific field into multiple columns, all other fields you want present in the resulting table must be individually added to "row labels".
First off, it is very time-consuming when you have a lot of columns to add.
Secondly, when new columns are added in the source data, these new fields are not automatically included. When this happens, we need to manually add each new field to the row labels.
When using an automation tool such as Trifacta, I would expect my flow to handle new columns without having to fix it every time. Adding an option for "All other fields", or being able to select the fields to exclude instead, would make this process much smoother and ensure that our flows are future-proof.
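The requested "All other fields" behavior can be sketched outside Dataprep. In pandas, for example, the row labels can be computed from the current schema, so a new source column is picked up without editing the step (the column names and data here are made up):

```python
import pandas as pd

df = pd.DataFrame({
    "region":  ["East", "East", "West", "West"],
    "product": ["A", "B", "A", "B"],
    "quarter": ["Q1", "Q1", "Q1", "Q1"],
    "sales":   [10, 20, 30, 40],
})

pivot_col, value_col = "product", "sales"
# "All other fields" become row labels automatically: any column that is
# neither pivoted nor aggregated is kept as an index, including columns
# added to the source later.
row_labels = [c for c in df.columns if c not in (pivot_col, value_col)]

wide = df.pivot_table(index=row_labels, columns=pivot_col,
                      values=value_col, aggfunc="sum").reset_index()
```

Because `row_labels` is derived from `df.columns` at run time rather than hard-coded, the pivot is future-proof in exactly the sense the request describes.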
Currently, scheduling is allowed only at the instance level.
The request is to be able to allow scheduling for particular users instead of all instance users.
Use linked datasets created by GCP Analytics Hub as a data source in Dataprep. Detailed information in the link below:
Can I use linked dataset (created by Analytic Hub in GCP) to build flows in DataPrep? (trifacta.com)
We often use hashing functions like fingerprint in SQL (BigQuery) to mark or identify rows that match on specific attributes, or to generate UUIDs. I know it's possible to do so by adding UDFs, but it would be more convenient to have a native function.
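Until a native function exists, the UDF approach boils down to something like the following sketch. This is not BigQuery's FARM_FINGERPRINT (which uses FarmHash and returns an INT64), but it illustrates the same idea: a deterministic hash over selected attributes, plus a deterministic UUID derived from them:

```python
import hashlib
import uuid

def row_fingerprint(*attrs) -> str:
    """Deterministic fingerprint of selected row attributes.
    A stand-in for FARM_FINGERPRINT: different hash, same idea."""
    # Join with the unit-separator character so ("ab", "c") and
    # ("a", "bc") produce different fingerprints.
    joined = "\x1f".join(str(a) for a in attrs)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()

def row_uuid(*attrs) -> str:
    # Deterministic UUID (version 5) derived from the same attributes:
    # the same inputs always yield the same UUID.
    return str(uuid.uuid5(uuid.NAMESPACE_OID, "\x1f".join(map(str, attrs))))
```

Determinism is the property that matters here: re-running the flow over the same rows yields the same identifiers, so they can be used as stable keys.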