Duplicate Records
Hello Alteryx community! I'm hoping you can help with the questions below, as I'm brand new to Alteryx. I need to build a workflow that checks the following:
- Are there duplicate records in the RTT_SUMMARY All reduced.csv table? If so, how many?
- After removing duplicate records, does OPA_ACCOUNT_NUM determine STREET_ADDRESS?
- Does OPA_ACCOUNT_NUM determine ZIP_CODE?
- What should be the primary key for the RTT_SUMMARY All reduced.csv table? (Hint: Data may not currently support that)
- Labels:
- Data Investigation
- Datasets
- Help
Hi @amcgill2 ,
I'm not sure what you mean by "determines". Do you mean OPA_ACCOUNT_NUM = STREET_ADDRESS, or that the two fields relate in some other way?
In the data you have provided, OPA_ACCOUNT_NUM does not equal STREET_ADDRESS or ZIP_CODE.
I have attached a workflow that checks for duplicates. Since you haven't said which field(s) should serve as the key for identifying duplicates, I've only checked for rows that are exact duplicates across all columns.
I removed these duplicate rows using the Unique tool.
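For anyone who wants to cross-check the result outside Alteryx, the same exact-match logic can be sketched in plain Python. This is only an illustration: the helper name `count_exact_duplicates`, the column layout, and the sample rows are made up for the example and are not taken from RTT_SUMMARY All reduced.csv.

```python
from collections import Counter


def count_exact_duplicates(rows):
    """Treat a row as a duplicate only if every column matches an earlier row.

    Returns (unique_rows, n_duplicates), where n_duplicates counts the
    extra copies that would be dropped.
    """
    counts = Counter(tuple(row) for row in rows)
    unique_rows = list(counts)  # Counter keeps first-seen order
    n_duplicates = sum(c - 1 for c in counts.values())
    return unique_rows, n_duplicates


# Hypothetical sample rows: (document id, account number, address, zip)
sample = [
    ("D1", "111", "1 MAIN ST", "19104"),
    ("D1", "111", "1 MAIN ST", "19104"),  # exact copy of the first row
    ("D2", "222", "2 OAK AVE", "19103"),
]

unique_rows, n_duplicates = count_exact_duplicates(sample)
print(f"{n_duplicates} duplicate row(s), {len(unique_rows)} unique row(s)")
```

To run it on the real file, the rows could be read with `csv.reader` from the standard library, skipping the header line first.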
The primary key is unknown. It looks like it should be Document_ID, but with duplicates in the data there is no way to confirm that.
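If "determines" is meant in the functional-dependency sense (every OPA_ACCOUNT_NUM value maps to exactly one STREET_ADDRESS), that can be tested after de-duplication by grouping on one field and counting the distinct values of the other. The sketch below assumes that reading. The helper names `determines` and `is_candidate_key` and the sample rows are hypothetical, not from the attached workflow or the real data.

```python
from collections import defaultdict


def determines(rows, key_col, value_col):
    """Check whether each key_col value maps to exactly one value_col value.

    Returns (holds, violations), where violations maps each offending key
    to the set of conflicting values found for it.
    """
    seen = defaultdict(set)
    for row in rows:
        seen[row[key_col]].add(row[value_col])
    violations = {k: v for k, v in seen.items() if len(v) > 1}
    return not violations, violations


def is_candidate_key(rows, col):
    """A column can act as a primary key only if it is never empty and never repeats."""
    values = [row[col] for row in rows]
    no_blanks = all(v not in (None, "") for v in values)
    return no_blanks and len(set(values)) == len(values)


# Hypothetical de-duplicated rows, for illustration only
rows = [
    {"Document_ID": "D1", "OPA_ACCOUNT_NUM": "111", "STREET_ADDRESS": "1 MAIN ST", "ZIP_CODE": "19104"},
    {"Document_ID": "D2", "OPA_ACCOUNT_NUM": "111", "STREET_ADDRESS": "1 MAIN STREET", "ZIP_CODE": "19104"},
    {"Document_ID": "D3", "OPA_ACCOUNT_NUM": "222", "STREET_ADDRESS": "2 OAK AVE", "ZIP_CODE": "19103"},
]

addr_ok, addr_conflicts = determines(rows, "OPA_ACCOUNT_NUM", "STREET_ADDRESS")
zip_ok, zip_conflicts = determines(rows, "OPA_ACCOUNT_NUM", "ZIP_CODE")
print("Account determines address:", addr_ok, addr_conflicts)
print("Account determines zip:", zip_ok)
print("Document_ID usable as key:", is_candidate_key(rows, "Document_ID"))
```

In this made-up sample the account number determines the zip code but not the address, because one account appears with two spellings of the same street. In Alteryx, a roughly equivalent check would be a Summarize tool grouping by OPA_ACCOUNT_NUM with a Count Distinct on the other field, then filtering for counts greater than 1.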
M
