cancel
Showing results for 
Search instead for 
Did you mean: 

Alteryx Designer Ideas

Share your Designer product ideas - we're listening!

Include Non-breaking Spaces in DataCleansing and Data Grid Warnings

Hi,

 

I wasted a good old chunk of time dealing with non-breaking spaces, and Alteryx could be improved by handling this automatically.

 

A space is a space, right?  Nope, there are spaces (ASCII value decimal 32) and there are non-breaking spaces (ASCII value decimal 160).  They look the same, but have slightly different behaviour in certain circumstances, like when text is auto-wrapped.

 

The DataCleansing tool cleans spaces, but leaves non-breaking spaces.

The Data Grid puts a warning on cells with leading or trailing spaces, but remains silent for non-breaking spaces.

 

I was trying to match two strings, that looked identical.  I had DataCleansed my cells, and the grid was showing me nothing wrong with the data.  In desperation, I copied the two data cells that I expected to match to a text editor (Textpad), and then examined the binary ASCII values of the data.  One cell had a trailing non-breaking space, and that caused the failure to match.

 

This was hard to find.  For someone less hopelessly nerdy, it would be practically impossible. 

 

As a small change, it might be really useful for Alteryx to include non-breaking spaces in it's definition of "space", such that DataCleansing tool removes it, and the Data Grid flags up the cell as having a leading or trailing space.

 

You could pick up non-breaking spaces from HTML, or from Excel.  I think mine came from a SQL script but I am not sure how it was there.  They are out there, and they will bite.

2 Comments
Aurora
Aurora

Hey @Hiblet

 

You can trim this kind of thing by using a RegEx to pull out any characters other than [A-Z]; [0-9] etc - but I agree that this is not optimal.

As you say - it would be good to just include a filter in the cleansing tool by default that removes non-visible characters.

 

Not sure if you've played with this but in the interim you can add this to the Data Cleanse tool yourself - it's a macro so you can add this filter yourself if you need it for later?

Asteroid

Hi @SeanAdams

 

I didn't realise I could edit the tool.  I have had a quick look and, yep, you can get into the Action tool and edit the formula...

 

  REGEX_REPLACE([_CurrentField_],'\s+','')

 

...which is fab.  A real eye-opener.  My Regex skills are not yet up to snuff enough to know what to use to drop nbsp's, but a little googling will have to be done.

 

I was quite freaked out by the problem because it was quite hard to even find out what was wrong.  I had to use TextPad to open the file as binary to see the ASCII codes.  I knew about non-breaking spaces from writing HTML, but I was surprised that they had got into the data I had and tripped me up.

 

Thanks for the response, I have learned something very useful!