SOLVED

regular expression to remove duplicates

juan1
7 - Meteor

Dear all, 

 

I have been trying to wrap my head around regular expressions for a problem that I have. I have reviewed similar posts but none worked. 

The problem is that I have a set of values that were duplicated by an earlier Text to Columns step, which I used to split strings containing "/" into rows. I tried using the following RegEx:

^(.*)(\r?\n\1)+$

and replacing with \1. 
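For readers following along, here is a minimal Python sketch of what that pattern does. It assumes the rows are first joined into one multi-line string, which is not how a row-based table is processed; `collapse_repeated_lines` is a hypothetical helper name. It also hints at why the expression may have no effect when each record is matched on its own.

```python
import re

# The pattern from the post: a line followed by one or more identical lines.
PATTERN = re.compile(r"^(.*)(\r?\n\1)+$", re.MULTILINE)

def collapse_repeated_lines(text: str) -> str:
    """Replace each run of identical consecutive lines with a single copy."""
    return PATTERN.sub(r"\1", text)

# Sample rows from the post, joined into one string (an assumption for this sketch).
rows = ["51803313", "51803313", "518033185", "51803323", "51803323", "518033220"]
collapsed = collapse_repeated_lines("\n".join(rows)).split("\n")
print(collapsed)

# Applied to one row at a time, the pattern has no newline to match, so nothing changes.
per_row = [collapse_repeated_lines(row) for row in rows]
print(per_row == rows)
```

On the joined string the duplicates collapse; applied row by row, every value comes back unchanged.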

 

This is my data; I want to remove one of the duplicate "3"s.

Value    total
5180331  3
5180331  3
5180331  85
5180332  3
5180332  3
5180332  20

 

Desired output: 

Value    total
518033   3
518033   85
518033   3
518033   20

 

Thank you in advance for your time and help.

 

Best,

Juan1

5 REPLIES
MarqueeCrew
20 - Arcturus

@juan1 ,

 

Have you considered using the Sample tool? Group by value and skip the first record?

https://help.alteryx.com/current/designer/sample-tool

 

Then you can use the Left() function to remove the last digit.

 

 Cheers,

 

 mark

Alteryx ACE & Top Community Contributor

echuong1
Alteryx Alumni (Retired)

I am a bit confused by your description and example. Are you just trying to remove the duplicates based on the Total-Value pairs, and then trim the last digit of the Value?

 

If so, you can use the Unique tool with both total and Value checked. This removes the duplicates, and the unique records come out of the U output. From there, you can use RegEx to trim the last digit. You can also do the trimming with a Formula tool.
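The same two steps can be sketched outside Alteryx in Python, purely as an illustration. The split of the sample data into (Value, total) pairs is an assumption, and `unique_pairs` and `trim_last_digit` are hypothetical helper names, not Alteryx functions.

```python
import re

# Assumed split of the sample data into (Value, total) pairs.
rows = [
    ("5180331", "3"),
    ("5180331", "3"),
    ("5180331", "85"),
    ("5180332", "3"),
    ("5180332", "3"),
    ("5180332", "20"),
]

def unique_pairs(records):
    """Keep the first occurrence of each (Value, total) pair, preserving order."""
    return list(dict.fromkeys(records))

def trim_last_digit(value: str) -> str:
    """Remove a single trailing digit, if there is one."""
    return re.sub(r"\d$", "", value)

result = [(trim_last_digit(value), total) for value, total in unique_pairs(rows)]
print(result)
```

The order matters in this sketch: trimming before deduplicating would make the two "3" rows look identical and collapse them into one, which is not what the desired output shows.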

 

echuong1_1-1613526800054.png

 

 

Ben_H
11 - Bolide

Hi @juan1 

 

The only thing I have to add is a question: why did your Text to Columns step introduce duplicates in the first place?

 

Wouldn't it be easier to fix that than to add an extra step?

 

Regards,

 

Ben

 

jf97hernandez
7 - Meteor

@MarqueeCrew 

 

Thank you for your suggestion! The sample tool worked marvelously; I solved it without the need of the LEFT function. 

 

For anyone looking at this in future searches:

 

jf97hernandez_0-1613586278918.png

 

My data totaled 50 so I chose N=50.

 

Best,

Jf97hernandez

juan1
7 - Meteor

@Ben_H

 

I had used a Text to Columns step before, and the values have distinct categorizations. Imagine the same transaction number for a credit card purchase, which I divided between the cost and the credit card usage fee. That is why I had duplicates.

So, before getting to this, I used Text to Columns to split into rows to clean some data and ended up with the original value more than twice (which was what I needed). In the end I got rid of the unnecessary duplicates using the Sample tool.

 

Best,

Juan1
