Alteryx designer Discussions

Find answers, ask questions, and share expertise about Alteryx Designer.
Alteryx is here to help you solve your biggest data challenges. Read about the new Virtual Solution Center here.

Using Data Cleansing to delete newline

Highlighted
7 - Meteor

I imported an xlsx into Alteryx, there is column of cells contains a lots of new line format. I tried to use data cleansing to delete it. It was successful however, it it leave a gap where the new line was e.g

 

tractor was go

ing to the farm

 

turns to 

 

tractor was go ing to the farm

 

Please any ideas?

Many Thanks

Highlighted
Alteryx Certified Partner

Hi @Felix776,

 

Could you add a sample data file?

 

Regards,

Jonathan

Highlighted
Alteryx Certified Partner
Alteryx Certified Partner

In some cases, what looks like a newline could be another character. I've had trouble in the past with a "vertical tab", which Excel likes to use. 

 

Here's a couple links that outline some different line tabulation characters:

https://www.regular-expressions.info/nonprint.html

https://stackoverflow.com/a/36572157 

Highlighted
7 - Meteor

Hi Jonathan,


Please find attached.

 

It could be as Charlie suggested something do with Vertical tab, which also included a lot of examples. Is there any recommendation how I could get rid of? 

 

It appears the issue I have only happen in the Narrative column

 

Many Thanks

Alteryx Certified Partner
Alteryx Certified Partner

@Felix776 Thanks for sharing your workflow.

 

FYI when you share workflows (.yxmd), it does not include the dependencies like the Excel file. You could either attach the Excel file separately, or export the workflow (Options>Export Workflow) which will include the Excel file in a compressed Alteryx format. 

 

Until we can test with the actual data, here's something to try:

 

REGEX_Replace(
[Narrative]+[Narrative 5]+[Narrative 6]+[Narrative 7]+[Narrative 8]+[Narrative 9]
,"\W","")

 

Update your Formula tool with the expression above. This will use RegEx to replace any non-word character (referenced by "\W") with nothing (referenced by ""), effectively deleting it. 

Highlighted
7 - Meteor

Hi Charlie,

 

I have upload the xlsx this time.

 

Sorry about that. 

 

Regards


Felix

Highlighted
7 - Meteor

I tried your formula, I think it works to well! I deleted all the spaces!

Highlighted
Alteryx Certified Partner

Hi @Felix776,

 

You could use a Regex_Replace() function to remove the line breaks. I've used the following function inside a multi-field formula to perform the function on all fields at once. Then combining the columns using the formula tool.

 

REGEX_Replace([_CurrentField_], '\n', '')

 

image.png

 

image.png

 

If this solves your issue please mark the answer as correct, if not let me know! I've attached my workflow for you to download if needed.

 

Regards,

Jonathan

Highlighted
Alteryx Certified Partner
Alteryx Certified Partner

@Felix776 wrote:

I tried your formula, I think it works to well! I deleted all the spaces!


How about this?

 

REGEX_Replace(
[Narrative]+[Narrative 5]+[Narrative 6]+[Narrative 7]+[Narrative 8]+[Narrative 9]
,"[^\w\s]","")

 

This version will remove anything that isn't a word character (a-z,0-9,_) or a space (\s)

Labels