This site uses different types of cookies, including analytics and functional cookies (its own and from other sites). To change your cookie settings or find out more, click here. If you continue browsing our website, you accept these cookies.
I have a pipe delimited file, which I am bringing in using the input tool and reading it in as a delimited text file.
I need to find all fields with extra white spaces, but when I use REGEX_CountMatches, Nothing is returned. It appears that the white space is cleaned before I even do anything with the file. I am essentially running a data quality check on a file that a partner of ours will be sending us, so I need to be able to find all of the white spaces in the file, so they can correct this before sending it through an automated process.
I need a way to return the spaces after 1 and the spaces before two, but alteryx is cleaning this up! How do I get Alteryx to keep the spaces??
I'm not sure if this is standard behaviour or only happens with pipe delimiters, but you can get around it this way
Change the delimiter in your Input to \0(no delimiter).
Then split the field on the pipe and use a dynamic rename to get the field names
Edit: I checked with other delimiters and this behaviour occurs in all cases. According to the main wiki article on CSV, removing the leading and trailing spaces makes this implementation non-compliant with RFC-4180. It may be something you want to bring up as an issue.
A workaround would be using no delimiters (actually, \0 delimiter) and Text to Columns right after, but this would take extra work, since you would have to use a fixed number of columns or to do some tranposing if you split to rows.
Thanks! This did work. I had to make one change. Since the actual file I am working has 87 columns, I had to change the field length in the input file. Without changing the field length, I was losing more than half of my header fields.