Hello,
I need to store many different regular expressions in a column and use them in the MATCHES function (for example).
But Dataprep doesn't currently accept a column as the pattern input, so the value stored in the column cannot be read as an actual regular expression.
I think this would be a great feature!
Thanks.
More information about this case: https://community.trifacta.com/s/question/0D53j00007kB5UmCAK/matches-function-using-pattern-regex-st...
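To illustrate the behaviour being asked for, here is a minimal Python sketch (not Dataprep syntax) in which each row is tested against the pattern stored in the same row. The function name `matches_per_row` and the sample data are made up for the example.

```python
import re

def matches_per_row(values, patterns):
    """Test each value against the pattern found in the same row."""
    results = []
    for value, pattern in zip(values, patterns):
        # re.search looks for the pattern anywhere in the value
        results.append(re.search(pattern, value) is not None)
    return results

# Hypothetical data: one column of values, one column of patterns
values = ["order-123", "invoice-abc", "ticket-42"]
patterns = [r"\d{3}$", r"^\d+", r"ticket-\d+"]
print(matches_per_row(values, patterns))
```

The point is only that the pattern is data, read row by row, rather than a literal typed into the function call.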
Currently, when a recipe is copied, the data quality rules in the original recipe are not duplicated in the copy. To implement a systematic data quality program, the rules must be recreated manually for every single recipe, which takes a lot of time. It would be great if the data quality rules could persist when the recipe is copied.
At the moment, long formulas are very difficult to read because they cannot be "beautified". Instead of allowing for multi-line text and indentation, our formulas are in a single-line, wrapped textbox. It would be very helpful if Trifacta supported "beautification-enabled" textboxes for formulas so that we can write formulas that are easy to read and understand.
Standardize is an amazing function! ... if you know that you won't have any more values added to a column later. With standardize, it's impossible to account for future source values.
It would be super helpful if there were a way to add additional Source Values (and, accordingly, New Values for those source values) to account for values that might appear in the future (but aren't in your data right now).
I realize there are already a number of ways to account for "future" values. Some examples include if...then...else, condition column > case on single column, condition column > case on custom column. However, these transforms are not friendly for those unfamiliar with coding in low-code tools, and this no-code upgrade to Standardize could help these users.
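As a rough picture of what this would look like, here is a small Python sketch (not how Standardize is implemented). The mapping contains a source value that is not in the data yet, and values without a mapping are left unchanged. The `standardize` function and the sample values are made up for the example.

```python
def standardize(values, mapping):
    """Replace each source value with its new value; leave unmapped values as-is."""
    return [mapping.get(value, value) for value in values]

# Hypothetical mapping: "U.S.A." is not in today's data but may appear later
mapping = {"USA": "United States", "U.S.": "United States", "U.S.A.": "United States"}

today = ["USA", "U.S.", "Canada"]
tomorrow = ["U.S.A.", "Canada"]
print(standardize(today, mapping))
print(standardize(tomorrow, mapping))
```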
We often use hashing functions like fingerprint in SQL (BigQuery) to mark or identify rows that match on specific attributes, or to generate UUIDs. I know it's possible to do so by adding UDFs, but it would be more convenient to have a native function.
Hi,
I use the import-by-folder option for GCS files to import many files at once (the files present now and future files dropped in the same folder). Sometimes the schema is not exactly the same for all the files, but the column names are always the same! I would like to use "union by name" for the initial union of the files in the folder that I've imported. With this option, if the schema changes in the future, my import will still work.
We could have a screen like the recipe union screen for the "import with union" (for imports by folder), to select the columns to import and the type of matching, for example.
This is a real issue for me because when the schema of one file changes, the scheduled runs fail.
Sorry for my bad English, I'm French :-)
Thanks !
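For readers unfamiliar with "union by name", here is a minimal Python sketch of the idea behind the folder-import request above. It is an illustration only, not how Dataprep performs its union. Each file is represented as a list of rows (dicts), and the `union_by_name` helper and sample rows are made up for the example.

```python
def union_by_name(tables):
    """Stack tables by column name; columns missing from a file are filled with None."""
    # Collect every column name, in order of first appearance
    columns = []
    for table in tables:
        for row in table:
            for name in row:
                if name not in columns:
                    columns.append(name)
    # Rebuild each row with the full column list
    return [
        {name: row.get(name) for name in columns}
        for table in tables
        for row in table
    ]

# Hypothetical files: same column names, different order, one extra column
file_a = [{"id": 1, "name": "a"}]
file_b = [{"name": "b", "id": 2, "city": "Paris"}]
print(union_by_name([file_a, file_b]))
```

Because the rows are aligned on names rather than positions, a reordered or added column in a later file does not break the union.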
I often receive data sets that have rows above the column headers that I don't need. When importing a data set, there is a dropdown on the edit menu to "make the first row a column header". However, I would like this dropdown to include an option to, for example, "make row 20 the column header and delete all preceding rows". This would allow me to import the data with the column headers already in place. When dealing with one dataset, I can always choose any row as the column headers, but when I have to join 20 similar datasets, it is not possible to do the same. Not sure if my idea is clear (lol), but it seems like something that could be easily incorporated into the tool. Thanks!
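In case it helps make the idea clearer, here is a small Python sketch of the behaviour described above, using only the standard `csv` module. The `read_with_header_at` function and the sample text are made up for the example.

```python
import csv
import io

def read_with_header_at(text, header_row):
    """Use row number `header_row` (1-based) as the header and drop all rows before it."""
    rows = list(csv.reader(io.StringIO(text)))
    header = rows[header_row - 1]
    return [dict(zip(header, row)) for row in rows[header_row:]]

# Hypothetical file with two unwanted rows above the real header
text = "Report\nGenerated 2020\nid,name\n1,a\n2,b\n"
print(read_with_header_at(text, 3))
```

Applied with the same row number to every file in a set, this would give all 20 datasets matching headers before they are joined.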
When you select columns to apply a function or transformation, the methods to select columns are:
But it is not possible to select columns with a "RegEx match" method on the column names.
It would be much easier!
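A minimal Python sketch of what a "RegEx match" selection on column names could do (an illustration only, not an existing option in the tool). The `select_columns` helper and the column names are made up for the example.

```python
import re

def select_columns(column_names, pattern):
    """Return the column names that match the regular expression."""
    regex = re.compile(pattern)
    return [name for name in column_names if regex.search(name)]

# Hypothetical column names
columns = ["sales_2019", "sales_2020", "region", "cost_2020"]
print(select_columns(columns, r"^sales_\d{4}$"))
```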
The ability to apply various interpolation methods (cspline, linear, etc.) between sorted columns of integers.
It would be great if you could expand the metadata selection so that it is not limited to two elements (row number and file path) but could also include the creation date timestamp (e.g. $datecreated) for use in recipes.
The current syntax for the WORKDAY function is workday(date1,numDays,[array_holiday]), and array_holiday can't be a column of a table. When there are unpredictable non-trading days, for example because of typhoon weather, we always need to go and change the public holidays in the recipe. We would prefer that the holidays could come from a column in a table, so that we can just import the table and update it when needed.
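To show the intent, here is a rough Python sketch of a workday calculation where the holidays come from a list of dates, such as the values of an imported column. It assumes Monday-to-Friday working days and is not the tool's actual WORKDAY implementation. The `workday` function and the dates are made up for the example.

```python
from datetime import date, timedelta

def workday(start, num_days, holidays=()):
    """Return the date `num_days` working days after `start`, skipping weekends and holidays."""
    holidays = set(holidays)
    step = 1 if num_days >= 0 else -1
    remaining = abs(num_days)
    current = start
    while remaining > 0:
        current += timedelta(days=step)
        # weekday() is 0-4 for Monday-Friday
        if current.weekday() < 5 and current not in holidays:
            remaining -= 1
    return current

# Hypothetical holiday column, loaded from a table that is updated when needed
holiday_column = [date(2023, 9, 4)]
print(workday(date(2023, 9, 1), 1, holiday_column))
```

With the holidays kept in a table, an unexpected non-trading day only requires adding a row to that table; the recipe itself stays unchanged.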
Current NIST/NSA standard is SHA-2.
As a data wrangler, I would like to be able to hash a column's data using the SHA-256 hashing algorithm.
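For reference, this is what hashing a column with SHA-256 looks like in Python with the standard `hashlib` module. It is only a sketch of the requested result: the `sha256_column` helper is made up for the example, and encoding the values as UTF-8 text is an assumption.

```python
import hashlib

def sha256_column(values):
    """Return the SHA-256 hex digest of each value, encoded as UTF-8 text."""
    return [hashlib.sha256(str(value).encode("utf-8")).hexdigest() for value in values]

# Hypothetical column values
print(sha256_column(["abc", "alice@example.com"]))
```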