Help with regular expressions
- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hello Everyone,
I am trying to parse the pdf files and struck with splitting the row with dynamic columns. I need to split the row multiple columns as shown below.
split the TEXT to 8 columns
tried using the below formula but it’s not working
REGEX_Replace([YourField], "^(\\d+)\\s+([^\\d]+)\\s+(\\d{1,3}(?:,\\d{3})*|-)\\s+(\\d{1,3}(?:,\\d{3})*|-)\\s+(\\d{1,3}(?:,\\d{3})*|-)\\s+(\\d{1,3}(?:,\\d{3})*|-)\\s+(\\d{1,3}(?:,\\d{3})*|-)\\s+(\\d{1,3}(?:,\\d{3})*|-)\\s+(.+)$", "$1|$2|$3|$4|$5|$6|$7|$8|$9")
Much appreciated your help to resolve this issue
thanks
Solved! Go to Solution.
- Labels:
- Regex
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Have you tried Text to Columns Tool with \t as the separator? This looks like it may provide the simplest solution.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hi ,
I have tried text to columns but it’s not working as expected
Thanks,
sree
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
@Skotha123456 can you upload a sample file? is it delimited with space or tab?
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
If it's delimited with multiple spaces, you can use the RegEx Tool, configured to Tokenize to 8 columns with the following expression:
(.*?)(?:\s{2,}|$)
This will capture all items that are delimited by 2 or more spaces.
If that doesn't work, then I'm with @binuacs. Sample data would be helpful.
Hope this helps and Happy Solving!
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Upload data not images. I don't knon where you got your formula but it's very wrong.
REGEX_Replace([YourField], "^(\\d+)\\s+([^\\d]+)\\s+(\\d{1,3}(?:,\\d{3})*|-)\\s+(\\d{1,3}(?:,\\d{3})*|-)\\s+(\\d{1,3}(?:,\\d{3})*|-)\\s+(\\d{1,3}(?:,\\d{3})*|-)\\s+(\\d{1,3}(?:,\\d{3})*|-)\\s+(\\d{1,3}(?:,\\d{3})*|-)\\s+(.+)$", "$1|$2|$3|$4|$5|$6|$7|$8|$9") --- makes no sense for what you are trying to do. you want all 9 columns. your formula looks for 9 or statements. every single time the first time will create a TRUE so you will never look at Or statement 2.
try throwing this into the regex tool in parse mode change it by:
1) - replace your \\ with \ ---- \\ in perl/Alteryx regex - \\ would mean the character "\" --- so you do not want this. Don't trust gemini/chat gpt for this.
2) remove your | --- you don not need ors - you are trying to parse a single field for 9 distinct columns.
^(\\d+)\\s+([^\\d]+)\\s+(\\d{1,3}(?:,\\d{3})*|-)\\s+(\\d{1,3}(?:,\\d{3})*|-)\\s+(\\d{1,3}(?:,\\d{3})*|-)\\s+(\\d{1,3}(?:,\\d{3})*|-)\\s+(\\d{1,3}(?:,\\d{3})*|-)\\s+(\\d{1,3}(?:,\\d{3})*|-)\\s+(.+)$
follow those instructions and see what maps.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Thanks everyone, @flying008 many thanks and i took your suggestions and working fine with a small change
