
Alteryx Designer Discussions

Find answers, ask questions, and share expertise about Alteryx Designer.

Designer is inconsistent with non-breaking spaces

Alteryx Certified Partner

See the attached workflow. Alteryx Designer (I am working in 2018.2.4) does not appear to handle non-breaking spaces (UTF-8 \xC2\xA0) consistently, and the wording in Cleanse, Summarize and RegEx is confusing.

 

Alteryx Certified Partner

@Ruud,

 

The challenge with your expression is that the format of the data varies across rows. I used a different expression (implemented in the FORMULA tool) that appears to match your data better:

 

REGEX_Replace([Winning team], "(.*?)\s{0,1}[[:punct:]].*", '$1')

It is looking for the first case of ANYTHING that is followed by zero or one SPACES followed by any PUNCTUATION followed by anything.  It groups the first case of ANYTHING in a way that doesn't include a trailing space.

 

If you put that expression into a FORMULA tool the output is the cleansed winning team data.
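
For readers who want to try the pattern outside Designer, here is a rough Python sketch of the same idea. It is an approximation, not the Alteryx implementation: Python's `re` module has no `[[:punct:]]` class, so the sketch substitutes the ASCII characters in `string.punctuation`, and the sample value "Chicago Cubs (4-3)" and the helper name `clean_team` are made up for illustration.

```python
import re
import string

# Python's re has no POSIX [[:punct:]] class, so build an ASCII stand-in
# from string.punctuation (an assumption, not the Alteryx definition).
PUNCT_CLASS = "[" + re.escape(string.punctuation) + "]"

# Lazy group, optional single whitespace, one punctuation mark, then the rest.
PATTERN = re.compile(r"(.*?)\s{0,1}" + PUNCT_CLASS + r".*")


def clean_team(value: str) -> str:
    """Keep only the text before the first punctuation mark (hypothetical helper)."""
    return PATTERN.sub(r"\1", value)


if __name__ == "__main__":
    # Hypothetical sample value; the real data lives in the attached workflow.
    print(clean_team("Chicago Cubs (4-3)"))
```

Values that contain no punctuation do not match the pattern and are returned unchanged.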

 

Cheers,

 

Mark

Alteryx ACE & Top Community Contributor

Chaos reigns within. Repent, reflect and reboot. Order shall return.
Alteryx Certified Partner

Hi Mark, I know how to solve this specific case (it's at the bottom of the workflow, actually), but that's not what I'm going for. I find it confusing that non-breaking spaces are treated as non-whitespace, especially because the Trim tool and the Data Quality pane treat them differently than RegEx does.

 

I can explain why it happens, but that doesn't make it consistent or clear to the (average) user.
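
The same kind of mismatch can be reproduced in plain Python, which may help explain it to users. This is only an analogy and says nothing about how Designer is implemented: `str.strip()` and the default Unicode-aware `\s` treat U+00A0 as whitespace, while `\s` under the `re.ASCII` flag does not. The function names below are illustrative.

```python
import re

NBSP = "\u00a0"  # non-breaking space, UTF-8 bytes C2 A0


def trim_builtin(s: str) -> str:
    # str.strip() removes Unicode whitespace, including U+00A0.
    return s.strip()


def trim_regex_unicode(s: str) -> str:
    # Default str patterns: \s is Unicode-aware and matches U+00A0.
    return re.sub(r"\s+$", "", s)


def trim_regex_ascii(s: str) -> str:
    # With re.ASCII, \s only matches [ \t\n\r\f\v], so U+00A0 survives.
    return re.sub(r"\s+$", "", s, flags=re.ASCII)


if __name__ == "__main__":
    value = "Chicago Cubs" + NBSP
    for trim in (trim_builtin, trim_regex_unicode, trim_regex_ascii):
        print(trim.__name__, repr(trim(value)))
```

Three "trim" operations on the same value give two different answers, which is the inconsistency described above in miniature.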

Alteryx Certified Partner

@Ruud,

 

I was focused on the issue with the formula rather than what you might be saying about the Data Quality screen. (Screenshot: capture.jpg)

I've simplified the workflow to include 2 lines of data where the last character is character 160 (U+00A0, the non-breaking space, which is outside 7-bit ASCII). The Data Quality pane shows these as 100% OK. The last record is "Chicago Cubs" with no trailing characters. The Hash column shows that the values are different, and the right-most character column shows that the trailing character looks blank yet also gets 100% data quality.
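
A small Python sketch of the same check, for anyone who wants to verify it outside Designer. The choice of MD5 is an assumption (the post does not say which hash was used), and `md5_hex` is a hypothetical helper; the point is only that a trailing U+00A0 changes the hash while looking blank on screen.

```python
import hashlib


def md5_hex(text: str) -> str:
    """Hash the UTF-8 bytes of a string (MD5 chosen only for illustration)."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


with_nbsp = "Chicago Cubs\u00a0"  # trailing non-breaking space
plain = "Chicago Cubs"            # no trailing characters

if __name__ == "__main__":
    last = with_nbsp[-1]
    print("code point of last character:", ord(last))
    print("UTF-8 bytes of last character:", last.encode("utf-8"))
    print("hashes differ:", md5_hex(with_nbsp) != md5_hex(plain))
```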

 

I agree with your assessment that this should be reviewed by the development team.

 

Cheers,

 

Mark

Alteryx ACE & Top Community Contributor

Chaos reigns within. Repent, reflect and reboot. Order shall return.
Alteryx Certified Partner

A simplified data set ("piepklein datasetje", Dutch for "tiny little data set"), my favorite! Good showcase.
