Community Spring Cleaning week is here! Join your fellow Maveryx in digging through your old posts and marking comments on them as solved. Learn more here!
The Product Idea boards have gotten an update to better integrate them within our Product team's idea cycle! However this update does have a few unique behaviors, if you have any questions about them check out our FAQ.

Alteryx Designer Desktop Ideas

Share your Designer Desktop product ideas - we're listening!
Submitting an Idea?

Be sure to review our Idea Submission Guidelines for more information!

Submission Guidelines

Include Non-breaking Spaces in DataCleansing and Data Grid Warnings

Hi,

 

I wasted a good old chunk of time dealing with non-breaking spaces, and Alteryx could be improved by handling this automatically.

 

A space is a space, right?  Nope, there are spaces (ASCII value decimal 32) and there are non-breaking spaces (ASCII value decimal 160).  They look the same, but have slightly different behaviour in certain circumstances, like when text is auto-wrapped.

 

The DataCleansing tool cleans spaces, but leaves non-breaking spaces.

The Data Grid puts a warning on cells with leading or trailing spaces, but remains silent for non-breaking spaces.

 

I was trying to match two strings, that looked identical.  I had DataCleansed my cells, and the grid was showing me nothing wrong with the data.  In desperation, I copied the two data cells that I expected to match to a text editor (Textpad), and then examined the binary ASCII values of the data.  One cell had a trailing non-breaking space, and that caused the failure to match.

 

This was hard to find.  For someone less hopelessly nerdy, it would be practically impossible. 

 

As a small change, it might be really useful for Alteryx to include non-breaking spaces in it's definition of "space", such that DataCleansing tool removes it, and the Data Grid flags up the cell as having a leading or trailing space.

 

You could pick up non-breaking spaces from HTML, or from Excel.  I think mine came from a SQL script but I am not sure how it was there.  They are out there, and they will bite.

3 Comments
SeanAdams
17 - Castor
17 - Castor

Hey @Hiblet

 

You can trim this kind of thing by using a RegEx to pull out any characters other than [A-Z]; [0-9] etc - but I agree that this is not optimal.

As you say - it would be good to just include a filter in the cleansing tool by default that removes non-visible characters.

 

Not sure if you've played with this but in the interim you can add this to the Data Cleanse tool yourself - it's a macro so you can add this filter yourself if you need it for later?

Hiblet
10 - Fireball

Hi @SeanAdams

 

I didn't realise I could edit the tool.  I have had a quick look and, yep, you can get into the Action tool and edit the formula...

 

  REGEX_REPLACE([_CurrentField_],'\s+','')

 

...which is fab.  A real eye-opener.  My Regex skills are not yet up to snuff enough to know what to use to drop nbsp's, but a little googling will have to be done.

 

I was quite freaked out by the problem because it was quite hard to even find out what was wrong.  I had to use TextPad to open the file as binary to see the ASCII codes.  I knew about non-breaking spaces from writing HTML, but I was surprised that they had got into the data I had and tripped me up.

 

Thanks for the response, I have learned something very useful!

AlteryxCommunityTeam
Alteryx Community Team
Alteryx Community Team
Status changed to: Accepting Votes