Community Spring Cleaning week is here! Join your fellow Maveryx in digging through your old posts and marking comments on them as solved. Learn more here!

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Regex_Replace to remove carriage return + line feed

sd
7 - Meteor

I'm trying to remove carriage return+line feeds using the REGEX_Replace function in the formula tool.  In the screenshot below I have a simple test pipe delimited file with two fields, Description and Attempt, shown in Notepad ++ with the 'Show all characters' options selected so the carriage return+line feeds are visible.

Alteryx REGEX_01.png

I'm using the following REGEX_Replace function in the 'Description' field of my test file in order to test removing the extra Carriage Return+Line Feed showing in Line 4 of my test file:  REGEX_Replace(Description, "\r*\n*", "")

I took this tip from the following link:

https://community.alteryx.com/t5/Data-Preparation-Blending/Remove-new-line-characters/td-p/8053

For the output I'm wanting to end up with my header and four output rows of data because in my test scenario, rows 4 and 5 are really one row.

When I run my simple test and browse the results, the extra Carriage Return+Line Feed remains.  I've been playing around with this and cannot get it to work.

Below is a screenshot showing the results:

Alteryx REGEX_02.png

REGEX is new to me so I have a feeling I'm missing something basic.  Any insight is appreciated!

 

Stuart

10 REPLIES 10
MarqueeCrew
20 - Arcturus
20 - Arcturus
Your challenge requires multi line formulas. Because field 1 can continue on the next line, I might first count the number of delimiters on each row. Then I would use the multi row formula to concatenate row data. After make each row into two parts, I could filter the second rows out of the data.

At this point you're a close to done. Parse the data into columns and I think that we are there.

Cheers
Alteryx ACE & Top Community Contributor

Chaos reigns within. Repent, reflect and restart. Order shall return.
Please Subscribe to my youTube channel.
jgo
Alteryx Alumni (Retired)

Hi @sd,

 

Not sure if the solution is logical when working with actual data, but in the spirit of RegEx being a pattern based searching language, the pattern I discovered based on your example was that a TRUE new line is ended directly after a character that is NOT a "space", or \s.

 

I first concatenated all the records (Summarize) using \n as the separator and then replaced any instances of \s[\r\n] (translates to SPACE directly followed by either a character return or new line character) with an empty space.

 

Lastly I use split the data back into separate rows and then into their own columns (Text to Columns).

 

Again, may not be the best solution, but one that works using the example you provided. If anything, might spark some new ideas on how you might solve.

 

Hope this helps!

sd
7 - Meteor

Hi jgo,

 

Thank you for your reply.  As it turns out, I'm working in version 10.1.7.12188 of Alteryx and I'm unable to open your file.  Is there anyway you can save it in an older version of Alteryx and resend?

 

We plan to upgrade in mid-February with the planned release.

 

I'm curious to see how your solution works... Thanks again.

 

Stuart

jgo
Alteryx Alumni (Retired)

@sd,

 

No problem. Here's a copy that's compatible with your version.

sd
7 - Meteor

jgo,

 

Thanks for the new file, I'm able to open it. 

 

I replaced the 'new 1.txt' file with the file I was using (which I've attached in this reply).  I noticed your file has "\0" listed as the delimiter and my file has pipes, so when I made the replacement to my file, I also changed the delimiter in the configuration section.  With that change I'm getting the following error in the Summarize tool:

 

"The field 'Description|Attempt' is not contained in the record {Action=Concat}.  RecordInfo:CreateRecord: A record was created with no fields.  I'm not sure what is causing this error and how to fix it.  Can you assist?

 

Also... side note.  Is there any way to attach more than one file in a Post on this Forum?  I can also send you my Alteryx file as well.

 

Stuart

jgo
Alteryx Alumni (Retired)

@sd

 

Since the delimiter is not being represented properly in the data, I purposely configured it to read the data as a non delimited file, or "\0". Once the data is corrected, I use the 2 "Text to Column" tools to split them back out.

 

I made a couple of modifications (attached) which will correct the headers in the output.

 

- Unchecked option 6 in the Input tool so that it doesn't use the first row as headers

- Added the "Dynamic Rename" tool at the end to use the first row as headers

 

Re: side note... you can select multiple files in the "Open" window by holding the CTRL button. May be challenging if your files are in different directories so you may want to make sure everything you'd like to include is in the same place. Feel free to post your workflow if you'd like me to review.

sd
7 - Meteor

jgo,

 

Thanks again for your help.  I do have your solution working and I think I see what all is going on. 

 

I don't believe I can use the solution because I cannot guarantee that a true new line is ended directly after a character that is not a space.  That was just how I set up my test data.

 

I'm still not sure of what you mean by “the delimeter is not being represented properly in the data”? The problem with my data is that it is not clean, there are fields that contain text strings and in those text strings, the person doing the data entry hit a keyboard key that resulted in a carriage return/line feed being included in the data. It’s all contained in one field. The pipe delimiter was the character I chose to separate the fields.

 

My original thought was if the carriage return/line feed was completely contained within a specific field, I would be able to replace the text in that specific field with some other character. (or a blank) For example, in the attached files (Alteryx designer and sample text file) I’m able to put in a condition to replace a text character with a different character. All I want to do is the exact same thing, only replacing the carriage return/line feed with a blank. I just want to confirm this cannot be done with a single function – and that a more involved solution such as the one you posted is needed. (All assuming I cannot change the input data.)

 

Again – I appreciate all of your assistance (and patience) with this request.

jgo
Alteryx Alumni (Retired)

Hi @sd,

 

You're right, what's potentially not being represented properly is the carriage return/new line character, not the delimiter.

 

It doesn't look like your workflow was attached, only the 3 column txt file.

 

Glad to hear that you've got a solution that's working for you, and yes, to your point, don't believe that there's a simple "single function" can correct this.

 

One other possibility to create a more dynamic solution would be to sample the first row and count how many delimiters are there and then use that value to start validating the remaining rows and create logic that would clean up the erroneous \n when it's under the expected delimiter count.

 

 

sd
7 - Meteor

jgo,

 

I tried to submit the Alteryx file (.yxmd) from my pc, however I received an error message on posting that said"  "The contents of the attachment doesn't match its file type."  So... I'm unable to post an Alteryx file to this forum.  Possibly this is because the server that hosts the Alteryx application at my workplace is behind a firewall with no Internet access, so I need to post to this forum from my PC - and my PC does not have Alteryx installed??  Not sure of the reason.  (Or, maybe it's some other reason...)

Either way, it's very frustrating that I cannot post an Alteryx file (.yxmd) to this forum.

 

Thanks again for your original suggestion.  I learned about a few new tools/objects!

 

Stuart

Labels