Alteryx Designer Discussions

Find answers, ask questions, and share expertise about Alteryx Designer.

Designer is inconsistent with non-breaking spaces

Alteryx Certified Partner

See the attached workflow. It appears that Alteryx Designer (I'm working in 2018.2.4) does not handle non-breaking spaces (UTF-8 \xC2\xA0) consistently, and the wording in Cleanse, Summarize and RegEx is confusing.

 

Alteryx Certified Partner

@Ruud,

 

The difficulty with your expression is that the data is not formatted the same way in every row.  I used a different expression (implemented in the FORMULA tool) that appears to match your data better:

 

REGEX_Replace([Winning team], "(.*?)\s{0,1}[[:punct:]].*", '$1')

It looks for the shortest run of ANYTHING that is followed by zero or one SPACE, then any PUNCTUATION character, then anything else.  Only that first run of ANYTHING is captured, so the result does not include the trailing space.
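For readers who want to try the pattern outside Designer, here is a rough Python sketch of the same idea. It is an approximation, not the Alteryx implementation: Python's `re` module has no `[[:punct:]]` class, so the sketch builds one from `string.punctuation` (ASCII punctuation only), and the function name `clean_team` and the sample value are made up for illustration.

```python
import re
import string

# Approximation of the POSIX class [[:punct:]] (ASCII punctuation only).
PUNCT = "[" + re.escape(string.punctuation) + "]"

# Same shape as the expression above: lazy capture group, optional
# whitespace character, one punctuation character, then the rest.
PATTERN = re.compile(r"(.*?)\s{0,1}" + PUNCT + r".*")


def clean_team(value: str) -> str:
    """Keep the text before the first punctuation mark, without a trailing space.

    Values that contain no punctuation do not match and are returned unchanged.
    """
    return PATTERN.sub(r"\1", value)


# Hypothetical sample value; the real rows are in the attached workflow.
print(repr(clean_team("Chicago Cubs (NL)")))
```

Note that the pattern cuts at the first punctuation mark of any kind, so a name that itself contains a period or an apostrophe would be cut short.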

 

If you put that expression into a FORMULA tool the output is the cleansed winning team data.

 

Cheers,

 

Mark

Alteryx ACE & Top Community Contributor

Chaos reigns within. Repent, reflect and reboot. Order shall return.
Alteryx Certified Partner

Hi Mark, I know how to solve this specific case (it's at the bottom of the workflow, actually), but that's not what I'm going for. I find it confusing that non-breaking spaces are treated as non-whitespace, especially because the Trim tool and the Data Quality panes treat them differently than RegEx does.

 

I can explain why it happens, but that doesn't make it consistent or clear to the (average) user.
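To make the kind of mismatch concrete, here is a small Python illustration. It says nothing about how Designer is implemented internally; it only shows, as an analogy, how a Unicode-aware trim and an ASCII-only `\s` can disagree about the same character. The sample value is an assumption.

```python
import re

NBSP = "\u00a0"  # non-breaking space; encodes to the UTF-8 bytes C2 A0
value = "Chicago Cubs" + NBSP

# Unicode-aware trimming counts NBSP as whitespace and removes it.
trimmed = value.strip()

# With an ASCII-only definition of \s, NBSP is not whitespace,
# so the trailing character survives the substitution.
ascii_ws = re.sub(r"\s+$", "", value, flags=re.ASCII)

# With a Unicode-aware \s (Python's default for str patterns), it is removed.
unicode_ws = re.sub(r"\s+$", "", value)

print(NBSP.encode("utf-8"))
print(repr(trimmed), repr(ascii_ws), repr(unicode_ws))
```

Two tools that each pick a different definition of "whitespace" will give different answers for the same field, which is the inconsistency described above.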

Alteryx Certified Partner

@Ruud,

 

I was focused on the issue with the formula rather than what you might be saying about the Data Quality screen.  (Screenshot attached: capture.jpg)

I've simplified the workflow to include 2 lines of data where the last character is character 160, the non-breaking space (U+00A0, which is strictly outside the ASCII range).  The Data Quality shows this as 100% ok.  The last record is "Chicago Cubs" with no trailing characters.  You can see that the Hash values are different, and if you look at the right-most character of the non-breaking-space records you can see that it is blank and that it has 100% data quality too.
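The same check can be reproduced outside Designer with a few lines of Python. This is only a sketch: MD5 is an assumption (the post does not say which hash was used), and `md5_hex` is a helper name invented here.

```python
import hashlib

plain = "Chicago Cubs"
with_nbsp = "Chicago Cubs\u00a0"  # trailing non-breaking space (code point 160)


def md5_hex(text: str) -> str:
    """Hash the UTF-8 bytes of the text (MD5 chosen only for illustration)."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


# The two values look identical on screen, but the hashes differ and
# the code point of the right-most character reveals why.
for label, text in (("plain", plain), ("with NBSP", with_nbsp)):
    print(label, md5_hex(text), ord(text[-1]))
```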

 

I agree with your assessment that this should be reviewed by the development team.

 

Cheers,

 

Mark

Alteryx Certified Partner

Simplified data set ("piepklein datasetje", Dutch for "teeny-tiny data set"), my favorite! Good showcase.
