community
cancel
Showing results for 
Search instead for 
Did you mean: 

Alteryx designer Discussions

Find answers, ask questions, and share expertise about Alteryx Designer.
#SANTALYTICS

Gather all 9 clues to complete the final Weekly Challenge on Dec 16!

Learn More
SOLVED

Regex_Replace to remove carriage return + line feed

Alteryx Partner

I'm trying to remove carriage return+line feeds using the REGEX_Replace function in the formula tool.  In the screenshot below I have a simple test pipe delimited file with two fields, Description and Attempt, shown in Notepad ++ with the 'Show all characters' options selected so the carriage return+line feeds are visible.

Alteryx REGEX_01.png

I'm using the following REGEX_Replace function in the 'Description' field of my test file in order to test removing the extra Carriage Return+Line Feed showing in Line 4 of my test file:  REGEX_Replace(Description, "\r*\n*", "")

I took this tip from the following link:

https://community.alteryx.com/t5/Data-Preparation-Blending/Remove-new-line-characters/td-p/8053

For the output I'm wanting to end up with my header and four output rows of data because in my test scenario, rows 4 and 5 are really one row.

When I run my simple test and browse the results, the extra Carriage Return+Line Feed remains.  I've been playing around with this and cannot get it to work.

Below is a screenshot showing the results:

Alteryx REGEX_02.png

REGEX is new to me so I have a feeling I'm missing something basic.  Any insight is appreciated!

 

Stuart

Alteryx Certified Partner
Alteryx Certified Partner
Your challenge requires multi line formulas. Because field 1 can continue on the next line, I might first count the number of delimiters on each row. Then I would use the multi row formula to concatenate row data. After make each row into two parts, I could filter the second rows out of the data.

At this point you're a close to done. Parse the data into columns and I think that we are there.

Cheers
Alteryx ACE & Top Community Contributor

Chaos reigns within. Repent, reflect and reboot. Order shall return.
Alteryx Alumni (Retired)

Hi @sd,

 

Not sure if the solution is logical when working with actual data, but in the spirit of RegEx being a pattern based searching language, the pattern I discovered based on your example was that a TRUE new line is ended directly after a character that is NOT a "space", or \s.

 

I first concatenated all the records (Summarize) using \n as the separator and then replaced any instances of \s[\r\n] (translates to SPACE directly followed by either a character return or new line character) with an empty space.

 

Lastly I use split the data back into separate rows and then into their own columns (Text to Columns).

 

Again, may not be the best solution, but one that works using the example you provided. If anything, might spark some new ideas on how you might solve.

 

Hope this helps!

Alteryx Partner

Hi jgo,

 

Thank you for your reply.  As it turns out, I'm working in version 10.1.7.12188 of Alteryx and I'm unable to open your file.  Is there anyway you can save it in an older version of Alteryx and resend?

 

We plan to upgrade in mid-February with the planned release.

 

I'm curious to see how your solution works... Thanks again.

 

Stuart

Alteryx Alumni (Retired)

@sd,

 

No problem. Here's a copy that's compatible with your version.

Highlighted
Alteryx Partner

jgo,

 

Thanks for the new file, I'm able to open it. 

 

I replaced the 'new 1.txt' file with the file I was using (which I've attached in this reply).  I noticed your file has "\0" listed as the delimiter and my file has pipes, so when I made the replacement to my file, I also changed the delimiter in the configuration section.  With that change I'm getting the following error in the Summarize tool:

 

"The field 'Description|Attempt' is not contained in the record {Action=Concat}.  RecordInfo:CreateRecord: A record was created with no fields.  I'm not sure what is causing this error and how to fix it.  Can you assist?

 

Also... side note.  Is there any way to attach more than one file in a Post on this Forum?  I can also send you my Alteryx file as well.

 

Stuart

Alteryx Alumni (Retired)

@sd

 

Since the delimiter is not being represented properly in the data, I purposely configured it to read the data as a non delimited file, or "\0". Once the data is corrected, I use the 2 "Text to Column" tools to split them back out.

 

I made a couple of modifications (attached) which will correct the headers in the output.

 

- Unchecked option 6 in the Input tool so that it doesn't use the first row as headers

- Added the "Dynamic Rename" tool at the end to use the first row as headers

 

Re: side note... you can select multiple files in the "Open" window by holding the CTRL button. May be challenging if your files are in different directories so you may want to make sure everything you'd like to include is in the same place. Feel free to post your workflow if you'd like me to review.

Alteryx Partner

jgo,

 

Thanks again for your help.  I do have your solution working and I think I see what all is going on. 

 

I don't believe I can use the solution because I cannot guarantee that a true new line is ended directly after a character that is not a space.  That was just how I set up my test data.

 

I'm still not sure of what you mean by “the delimeter is not being represented properly in the data”? The problem with my data is that it is not clean, there are fields that contain text strings and in those text strings, the person doing the data entry hit a keyboard key that resulted in a carriage return/line feed being included in the data. It’s all contained in one field. The pipe delimiter was the character I chose to separate the fields.

 

My original thought was if the carriage return/line feed was completely contained within a specific field, I would be able to replace the text in that specific field with some other character. (or a blank) For example, in the attached files (Alteryx designer and sample text file) I’m able to put in a condition to replace a text character with a different character. All I want to do is the exact same thing, only replacing the carriage return/line feed with a blank. I just want to confirm this cannot be done with a single function – and that a more involved solution such as the one you posted is needed. (All assuming I cannot change the input data.)

 

Again – I appreciate all of your assistance (and patience) with this request.

Alteryx Alumni (Retired)

Hi @sd,

 

You're right, what's potentially not being represented properly is the carriage return/new line character, not the delimiter.

 

It doesn't look like your workflow was attached, only the 3 column txt file.

 

Glad to hear that you've got a solution that's working for you, and yes, to your point, don't believe that there's a simple "single function" can correct this.

 

One other possibility to create a more dynamic solution would be to sample the first row and count how many delimiters are there and then use that value to start validating the remaining rows and create logic that would clean up the erroneous \n when it's under the expected delimiter count.

 

 

Alteryx Partner

jgo,

 

I tried to submit the Alteryx file (.yxmd) from my pc, however I received an error message on posting that said"  "The contents of the attachment doesn't match its file type."  So... I'm unable to post an Alteryx file to this forum.  Possibly this is because the server that hosts the Alteryx application at my workplace is behind a firewall with no Internet access, so I need to post to this forum from my PC - and my PC does not have Alteryx installed??  Not sure of the reason.  (Or, maybe it's some other reason...)

Either way, it's very frustrating that I cannot post an Alteryx file (.yxmd) to this forum.

 

Thanks again for your original suggestion.  I learned about a few new tools/objects!

 

Stuart

Labels