community
cancel
Showing results for 
Search instead for 
Did you mean: 

Alteryx designer Discussions

Find answers, ask questions, and share expertise about Alteryx Designer.
SOLVED

Keeping some Strings and loosing others

This is my first post, and i have only used Alteryx a few times for small solutions, so im really just after a pointer in the right direction.

 

A little back ground I have an extract from one of our systems that basically holds the text of a conversation. In order to mine this data we are currently sending it out of the business to a third part for processing. We have to do a cleanse where any identifiable data is removed before sending. Now historically i did this using Parse and the RegEx tool which swapped all numbers [0-9] to 9, and that was acceptable.

 

So if as an example the original text was

     I have called 2 times on the 6th and 7th, and the order is 12 hours overdue please refund my credit card 1234-4567-7894-3214

It became

     I have called 9 times on the 9th and 9th, and the order is 9 hours overdue please refund my credit card 9999-9999-9999-9999

 

Now i have been asked to let some numbers through such as (where x is the number)

x times

xth

x hours

x substitutions

 

so the new cleaned string would be 

     I have called 2 times on the 6th and 7th, and the order is 12 hours overdue please refund my credit card 9999-9999-9999-9999

 

I know i can solve this using code (.net etc..) but can something be done in Alteryx? Is there a way to identify multiply matches in a string and then not parse them, but instead parse the remaining numbers to a 9.

 

Any help for this newbie would be appreciated

 

Thanks

Spence

 

 

 

 

Alteryx Certified Partner
Alteryx Certified Partner

I would suggest using the same methods that you are now, Regex and Parse but refining the Regex to only identify cases that need substitution such as credit card numbers, phone numbers, etc.

Director, Customer Enablement
Director, Customer Enablement

@SpenceInAsda

I have a crude RegEx that works for this case and is attached. Right now, this will only work for card numbers in the format of 4digit-4digit-4digit-4digit. This will not work for something like an Amex, but thought it might give you some direction to see one example.

Highlighted

Yes that would be the simple method , but the problem comes mainly from the security side I.e. its  business rule issue not a developer issue or at least i hope its not :smileyhappy:

 

We have to specify what data is going out of the business and get security to sign off, because we now need to add these extra elements of data to do deeper analysys at the third party, we have to be specific about what we are letting through, not what we are blocking :smileyfrustrated:. Also whilst credit cards, telephone numbers, and emails etc.. are fairly easy to find. Something such as account number just a list of 8 numbers is harder to pickout , and there is no way to ever bottom out the list of options to parse out

 

I know its a bit left field, but its just one of those situations, And i was hoping there might be a way in Alteryx so  i didnt have to use code.

Alteryx Certified Partner
Alteryx Certified Partner
\d{4,16}

That would match 4 to 16 digits and replace with 9.

269-555-1234 becomes 269-555-9999
1234-5678-1234-5678 becomes 9999-9999-9999-9999

Would that help?

Mark

Sent from my iPhone
Alteryx ACE & Top Community Contributor

Chaos reigns within. Repent, reflect and reboot. Order shall return.

Thanks JoeM

 

I have a good Regex that covers all credit cards i borrowed from this site http://www.regular-expressions.info/creditcard.html

 

but its the same problem i need to define what im letting through, not what im blocking out

 

 

Alteryx
Alteryx

I think the regex you are looking for is this:

 

\d(?!(\d* times|\d*th|\d* hours|\d* substitutions))

 

In a regex replace with replacement text 9

 

The key is the the negative look ahead ?!

 

Check out this link for the full explanation of how it works

 

 

https://regex101.com/r/xM3jD7/1

 

Adam Riley
Principal Software Engineer
Tech Lead Core Engines, Alteryx

Thats the solution i was after. I never knew you could do a negative search with regex, and they way you showed the solution on that website was really helpfull. I think i might borrow that method myself. :smileyvery-happy:

 

Thanks to all for helping the newbie out. :robotembarrassed:

 

Labels