Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Field should contain either alphabets A-Z, numbers 0-9, and certain special characters

JHAROHIT
7 - Meteor

Hi,

 

I have a requirement to check in a field that it should contain only alphabets A -Z (both upper and lower case) , numbers 0 - 9, symbols like   .,-,\ and latin set  like ÀÈÌÒÙàè.

 

Can someone please guide how should i be writing the regex expression and also explain the syntax.

 

Appreciate your help in advance, Thank you.

7 REPLIES 7
cjaneczko
13 - Pulsar

Try the following in a formula tool.

 

REGEX_Match([Field1], '.*[A-Za-z0-9.,\-\\ÀÈÌÒÙàè].*')
OllieClarke
15 - Aurora
15 - Aurora

Hi @JHAROHIT you might need to be a bit more specific here. RegEx is very precise.

 

For instance, in Alteryx the token '\w' contains all uppercase and lowercase latin letters, all numbers, underscores, and all letters from non-latin alphabets (e.g. greek, sanskrit etc)

 

If that's okay with you, and you don't want underscores then the following filter should work:

 

REGEX_Match([Field],'[\w.\-\\]+') AND !CONTAINS([Field],'_')


You can construct a set of ranges of acceptable characters (escaping those with RegEx meanings). This would only match unaccented latin letters, numbers, fullstops, hyphens and backslashes.

 

[A-Za-z0-9\.\-\\]+

 

 

You could continue to add accented characters to this set, either by inputting them as a list, or putting in the first and last characters in their unicode range (as with A-Z). These ranges are easily found on ASCII tables, or Wikipedia, or google, but as I say we need to know exactly which character should be allowed and which ones shouldn't

 

Hope that helps,

 

Ollie

 

JHAROHIT
7 - Meteor

Thank you, i tried it in a sample dataset.

 

My code in the formula tool is as below: 

If

REGEX_Match([Field 1], '.*[A-Za-z0-9.,\-\\ÀÈÌÒÙàèìòùÁÉÍÓÚÝáéíóúýÂÊÎÔÛâêîôûÃÑÕãñõÄËÏÖÜŸäëïöüŸ¡¿çÇŒœßØøÅåÆæÞþÐð].*') = 0
then
'F'
else
'T'
endif

 

However, i tried to insert a '$' between the allowed characters and it did not catch it.

 
Field 1ValidityValidity_1
1234-1T
abcdefg-1T
1-2-/4/5/,-1T
&%$#@0F
ÇŒœßØø-1T
abc$12345-1T
123ÀÈÌÒÙàèìòùÁ3456-1T

 

 
 

Although i have not coded '$' in the allowed characters.

 

 

 

JHAROHIT
7 - Meteor

Thank you Ollie,

 

I think now i now understand how it works.

 

However, special character in middle of the allowed characters, that is not getting caught in the exception. Here i have inserted two characters '$' and 'ݿ'.

While passed separately it works but not in mid of allowed characters.

 

My code:

 

If

REGEX_Match([Field 1], '.*[A-Za-z0-9.,\-\\ÀÈÌÒÙàèìòùÁÉÍÓÚÝáéíóúýÂÊÎÔÛâêîôûÃÑÕãñõÄËÏÖÜŸäëïöüŸ¡¿çÇŒœßØøÅåÆæÞþÐð].*') = 0
then
'F'
else
'T'
endif

 

 

Sample Input and output:

Field 1ValidityValidity_1
1234-1T
abcdefg-1T
1-2-/4/5/,-1T
&%$#@0F
ÇŒœßØø-1T
$0F
abc$12345-1T
123ÀÈÌÒÙàèìòùÁ3456-1T
abcݿ123445-1T
qbc12345-1T
ݿ0F
OllieClarke
15 - Aurora
15 - Aurora

Hi @JHAROHIT 

 

REGEX_MATCH() will always compare to the whole string.

 

If you want to only allow strings with those characters then

 

REGEX_MATCH([Field],'[A-Za-z0-9\.\,\-\\ÀÈÌÒÙàèìòùÁÉÍÓÚÝáéíóúýÂÊÎÔÛâêîôûÃÑÕãñõÄËÏÖÜŸäëïöüŸ¡¿çÇŒœßØøÅåÆæÞþÐð]+')

will return true for those, and false if a string contains at least 1 other character

Your '.*' at the beginning and end of your match means that your current formula returns true if there is at least 1 character from the set you created

 

Does that help/make sense?

 

Ollie

cjaneczko
13 - Pulsar

How about this.

 

If

REGEX_Match([Field 1], '.*[A-Za-z0-9.,$\\\-ݿÀÈÌÒÙàèìòùÁÉÍÓÚÝáéíóúýÂÊÎÔÛâêîôûÃÑÕãñõÄËÏÖÜŸäëïöüŸ¡¿çÇŒœßØøÅåÆæÞþÐð].*') = 0
then
'F'
else
'T'
endif
JHAROHIT
7 - Meteor

Thank a lott, Ollie.

 

That helps and solves my issue.

 

Appreciate it.

Polls
We’re dying to get your help in determining what the new profile picture frame should be this Halloween. Cast your vote and help us haunt the Community with the best spooky character.
Don’t ghost us—pick your favorite now!
Labels