
SOLVED

How to use regex to remove unwanted characters between # and #?

yashs
6 - Meteoroid

Hey guys!

I wanted to use regex to remove the unwanted data from my text file. It goes like this:

yashs_1-1606213449429.png

I have 2 questions here:

1. I know my data always lies between the tags <pre> and </pre>, and I have already extracted it.

But sometimes ## also appears between the tags. I want to replace every # and the text between them with a space. What's the best way to go about it?

 

2. Also, is it possible to extract the date and time from data in the form 201808010000?

 

Would really appreciate the help!

PhilipMannering
16 - Nebula

Assuming each pair of # characters appears on a single line, you could split to rows on \n and then simply use the expression,

 

#.*#

 

And replace it with nothing (or an empty string to be more precise) in the Regex Tool.
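For anyone who wants to try the idea outside Alteryx, here is a minimal Python sketch of the same steps: split on \n, then remove everything from the first # to the last # on each line. The sample text is made up, since the real data in this thread was only shared as an image.

```python
import re

# Hypothetical sample text (not the poster's real data).
text = "keep this line\n#unwanted note#\nkeep this too"

# Split to rows on \n, then strip the greedy #.*# match from each row.
cleaned_lines = [re.sub(r"#.*#", "", line) for line in text.split("\n")]

print(cleaned_lines)
```

Note that `#.*#` is greedy, so on a row with more than one pair of # it removes everything from the first # to the last one.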

 

If you care to share some data, I'm sure this is possible.

atcodedog05
22 - Nova

Hi @yashs 

 

Let me answer your 2nd question: yes, you can extract a datetime from the string.

 

Formula

DateTimeParse(ToString([DATE]),"%Y%m%d%H%M")
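As a rough cross-check outside Alteryx, the same format string can be used with Python's `datetime.strptime`. This is only a sketch; the variable names are made up and the value is the example from the question.

```python
from datetime import datetime

# Example value from the question, stored as a number like the [DATE] field.
raw = 201808010000

# Convert to a string first (like ToString), then parse with the same specifiers.
parsed = datetime.strptime(str(raw), "%Y%m%d%H%M")

print(parsed.isoformat())
```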

Hope this helps 🙂

yashs
6 - Meteoroid

I tried this approach earlier. But as it removes any data between #, it also removes 'needed data 1'

If you notice in the pic I shared, 'Needed data 1' also falls between # in a way
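The behaviour described here is probably the greedy match at work. A small Python sketch with a made-up row shows the difference between the greedy `#.*#` and a pattern that pairs each # with the next one. It assumes the unwanted text is wrapped in single # characters, which may not match the real file.

```python
import re

# Hypothetical row where the needed text sits between two #...# blocks.
line = "#junk 1# Needed data 1 #junk 2#"

# Greedy: matches from the first # to the last #, so the needed text goes too.
greedy = re.sub(r"#.*#", " ", line)

# Paired: [^#]* cannot cross a #, so each #...# block is removed on its own.
paired = re.sub(r"#[^#]*#", " ", line)

print(repr(greedy.strip()))
print(repr(paired.strip()))
```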

I have attached a file of my actual data. I need all the data which lies after this pattern:

'201701010538 TAF KROC' 

Except for the data which lies between # characters, of course.
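One way to picture the requirement is the Python sketch below. It assumes each record starts with a 12-digit timestamp, the word TAF, and a 4-letter station code, as in the pattern quoted above. The rest of the sample line is invented for illustration and is not taken from the attached file.

```python
import re

# Prefix like '201701010538 TAF KROC', followed by the data to keep.
PREFIX = re.compile(r"^(\d{12}) TAF ([A-Z]{4})\s*(.*)$")

# Hypothetical line; only the prefix comes from the thread.
line = "201701010538 TAF KROC #note# 0106/0206 27010KT"

m = PREFIX.match(line)
if m is None:
    raise ValueError("line does not start with the expected prefix")

timestamp, station, rest = m.groups()

# Drop each #...# block, then collapse the leftover whitespace.
rest = " ".join(re.sub(r"#[^#]*#", " ", rest).split())

print(timestamp, station, rest)
```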

atcodedog05
22 - Nova

-

 

Sorry, I only looked at your post just now.

atcodedog05
22 - Nova

Hi @yashs 

 

Can you provide the expected output?

yashs
6 - Meteoroid

@atcodedog05

I have not yet gone through your solution for the date issue, but the date data is slightly different. In the format of the attached file, I can extract the date by pulling out the leading digits.

yashs_0-1606215874157.png

In this format, you can see the data still lies between <pre> tags, but the date appears above it in a <td> tag, which repeats. The output from this type of data should be the same as the output in the attached file. Each line (from TAF AMD KMKE........0VC008=) should have its respective date before it.

 

To avoid confusion: I have 2 types of input, one shown in the snip here and the other attached to my previous reply as testdata.txt. The output for one of the formats is attached here.

atcodedog05
22 - Nova

Hi @yashs 

 

I was able to reproduce your expected output using some filtering.

atcodedog05_0-1606216248249.png

Please check and let me know.

yashs
6 - Meteoroid

It's working!

Thank you so much for your help!! Really appreciate it!

atcodedog05
22 - Nova

Happy to help 🙂 @yashs 

 

Cheers and Happy Analyzing 😀

 

Feel free to reach out if you face any issues 🙂
