Hi All,
I'm trying to use RegEx to split a comment field into separate comment blocks, then extract the commenter, time, and comment text from each block. Right now I'm stuck on the RegEx that breaks out the comment blocks. Does anyone know how best to solve this, or have links that might point me in the right direction?
Details on the problem below!
Input:
| ID | Comment |
| 1 | (abcd0ef.abcd0ef - 1/1/2019 2:00 PM) Random Free Text (xyz0ef.abc0ef - 1/2/2019 1:00 AM) Response |
| 2 | (a1cd0ef.abcd0ef - 12/1/2018 2:00 PM) Random Free Text (l0n0op.ghi0yf - 12/31/2018 1:00 AM) Other Random Free Text (system administrator - 1/3/2019 6:00 PM) More text |
| 3 | (Donny Darko - d10d00.hai0sd - 12/15/2018 8:00 AM) Other text (System Administrator - 12/15/2018 8:31 AM) Notes notes notes |
Desired Output:
| ID | Commenter | DateTime | Comment |
| 1 | abcd0ef.abcd0ef | 1/1/2019 2:00 PM | Random Free Text |
| 1 | xyz0ef.abc0ef | 1/2/2019 1:00 AM | Response |
| 2 | a1cd0ef.abcd0ef | 12/1/2018 2:00 PM | Random Free Text |
| 2 | l0n0op.ghi0yf | 12/31/2018 1:00 AM | Other Random Free Text |
| 2 | system administrator | 1/3/2019 6:00 PM | More text |
| 3 | d10d00.hai0sd | 12/15/2018 8:00 AM | Other text |
| 3 | System Administrator | 12/15/2018 8:31 AM | Notes notes notes |
Here's the process I have in mind:
1. Tokenize comment blocks -- e.g. "(abcd0ef.abcd0ef - 1/1/2019 2:00 PM) Random Free Text"
   - This partially works if I use parse instead of tokenize, but I can't get the regex to capture each block separately
   - Currently using: (\(.*\))
2. Parse commenter, datetime, comment
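For anyone who wants to try the two steps outside the RegEx tool, here is a rough Python sketch of the idea. It is only an illustration under a few assumptions: timestamps always look like the samples (M/D/YYYY h:mm AM/PM), the commenter is the last " - "-separated piece before the timestamp, and the names `HEADER` and `parse_comments` are made up for this example. It also shows why `(\(.*\))` struggles: `.*` is greedy, so it runs from the first "(" to the last ")" in the field. Anchoring on the timestamp and excluding parentheses inside the header avoids that.

```python
import re

# A block header looks like "(<who> - <date> <time> AM/PM)".
# [^()]*? cannot cross a parenthesis, so one match never spans two headers,
# and the timestamp requirement skips ordinary "(...)" inside comment text.
HEADER = re.compile(
    r"\(([^()]*?) - (\d{1,2}/\d{1,2}/\d{4} \d{1,2}:\d{2} [AP]M)\)"
)

def parse_comments(text):
    """Return a list of (commenter, datetime, comment) tuples for one field."""
    rows = []
    matches = list(HEADER.finditer(text))
    for i, m in enumerate(matches):
        # The comment runs from the end of this header to the next header
        # (or to the end of the field for the last block).
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        # Assumption: when a display name precedes the ID
        # ("Donny Darko - d10d00.hai0sd"), keep only the last piece.
        commenter = m.group(1).split(" - ")[-1].strip()
        rows.append((commenter, m.group(2), text[m.end():end].strip()))
    return rows

sample = ("(Donny Darko - d10d00.hai0sd - 12/15/2018 8:00 AM) Other text "
          "(System Administrator - 12/15/2018 8:31 AM) Notes notes notes")
for row in parse_comments(sample):
    print(row)
```

Slicing the text between consecutive header matches keeps the free-text part out of the regex entirely, so the comment can contain almost anything as long as it does not itself look like a header.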
Here are links to other posts that have been helpful so far:
Regex Parsing Help Needed With Special Characters
RegEx Get Repeated Parse Matches
Extract Date from Text
Regex Perl Syntax Guide & Regex Cheat Sheet
Challenge 11 - Identify Logical Groups
Attached is a file with my current work and a sample solution from another post on the Community.