For a simple check for duplicate values, highlight the data with your cursor. All duplicates will then be highlighted, and you can review them by scrolling up and down the dataset.
To find out how many duplicate values there are in each row:
For the total number of duplicates:
For more information, see the following page:
https://docs.trifacta.com/display/PE/Deduplicate+Data
I'm not sure if this will work for string data. So I basically have a column of customer IDs and I want to make sure that every customer ID only occurs once. I'm not sure how the countpattern transform will accomplish this for me. How do I just see if customer ID "ABC" occurs once, customer ID "DEF" occurs once, etc?
I have the same question. Did you get a response?
Gina answered the original question, which was how to "tell" or "see" if there is duplicate data in a column. Keep in mind that the data visible in Wrangler is a sample and may not represent your entire data set, depending on how large it is.
Actually enforcing uniqueness is also possible: see the Deduplicate Data page in the docs (https://docs.trifacta.com/display/SS/Deduplicate+Data). In the simplest case, whole rows may be duplicated -- see the Deduplicate Transform section. More likely, the data will contain multiple, differing rows with the same primary key (e.g., customer ID). See the Deduplicate Rows Based on a Primary Key section. Note that you will probably have to do some normalization and/or sorting of the relevant column(s) first. And of course, under this approach the row with the first instance of a given primary key value wins.
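To make the logic concrete, here is a minimal sketch in plain Python (not Wrangler's own transform syntax) of the two steps described above: counting how often each customer ID occurs, and keeping only the first row per ID. The `customer_id` and `city` fields, the sample rows, and the function names are hypothetical, and the normalization (trimming whitespace, upper-casing) is just one assumption about what "normalization" might mean for your data.

```python
from collections import Counter


def normalize(customer_id: str) -> str:
    # Assumed normalization: trim whitespace and ignore case.
    return customer_id.strip().upper()


def duplicate_counts(rows, key="customer_id"):
    # Count occurrences of each normalized key; report only keys seen more than once.
    counts = Counter(normalize(row[key]) for row in rows)
    return {k: n for k, n in counts.items() if n > 1}


def dedupe_first_wins(rows, key="customer_id"):
    # Keep the first row for each normalized key and drop later ones.
    seen = set()
    kept = []
    for row in rows:
        k = normalize(row[key])
        if k not in seen:
            seen.add(k)
            kept.append(row)
    return kept


# Hypothetical sample data: "abc " matches "ABC" after normalization.
rows = [
    {"customer_id": "ABC", "city": "Oslo"},
    {"customer_id": "DEF", "city": "Lima"},
    {"customer_id": "abc ", "city": "Rome"},
]

dupes = duplicate_counts(rows)
unique_rows = dedupe_first_wins(rows)
```

If `dupes` comes back empty, every customer ID occurs exactly once. Because the first row wins, sort the rows beforehand if you care which of the differing rows is kept.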
Thanks, we did it like this and it works.
image