Hi Alteryx community!
I'm working on extracting dates from free-form text in user descriptions. Snippet below. I've attached an Excel file as well.
Here's what I've tried so far:
1. Creating regex patterns for the date formats:
- New data often includes date formats I haven't accounted for, which means some dates may be missed
- With the large volume of data, manually spotting dates that regex misses (and distinguishing them from text without dates) is challenging
2. Using the Named Entity Recognition tool:
- The NER tool hasn't been tagging dates accurately, and I can't seem to find any additional settings to improve its precision
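One way to make the manual check in point 1 more manageable is to triage rows before reviewing them. Below is a minimal Python sketch (for example, for the Python tool), not a tested solution. `KNOWN_DATE` is a hypothetical stand-in for the existing patterns, and `DATE_HINT` and `triage` are illustrative names. The hint pattern is deliberately loose, so it will over-flag words such as "may" or "decide"; the goal is only to shrink the pile that needs a human look.

```python
import re

# Hypothetical stand-in for the patterns already written (e.g. 12/03/2024, 1-5-24).
KNOWN_DATE = re.compile(r"\b\d{1,2}[/.-]\d{1,2}[/.-]\d{2,4}\b")

# Loose hints that a date might be present: a month name or abbreviation,
# or a plausible 4-digit year. This is an assumption, not an exhaustive list.
DATE_HINT = re.compile(
    r"\b(?:jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)[a-z]*\b"
    r"|\b(?:19|20)\d{2}\b",
    re.IGNORECASE,
)

def triage(description):
    """Classify a description as 'matched', 'review', or 'no_date'."""
    if KNOWN_DATE.search(description):
        return "matched"   # a known format was found
    if DATE_HINT.search(description):
        return "review"    # looks date-like, but no known pattern matched
    return "no_date"       # nothing date-like detected

for text in [
    "Invoice dated 12/03/2024",
    "Delivered on 3rd March 2024",
    "Customer called about refund",
]:
    print(triage(text), "|", text)
```

Rows tagged "review" are the ones most likely to contain a format the patterns do not cover yet.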
Any insights or alternative methods to automate this process would be greatly appreciated. Thank you in advance!
Hi @suwenchuan,
I'm not sure whether this is what you want or whether you've already tried it, but please try the RegEx tool in 'Tokenize' mode, like this.
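For readers who want to prototype the same idea outside the RegEx tool, here is a rough Python sketch of tokenize-style matching, where every match in the text is returned rather than only the first. The pattern and the `tokenize_dates` name are assumptions for illustration; they cover only a handful of formats and are not the configuration used in the workflow.

```python
import re

MONTH = r"(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[a-z]*\.?"

# Hypothetical pattern covering a few common formats; extend as new ones appear.
DATE_PATTERN = re.compile(
    r"\b(?:"
    r"\d{4}-\d{1,2}-\d{1,2}"                  # 2024-01-15
    r"|\d{1,2}[/.-]\d{1,2}[/.-]\d{2,4}"       # 15/01/2024, 1-5-24
    r"|\d{1,2}\s+" + MONTH + r",?\s+\d{4}"    # 15 January 2024
    r"|" + MONTH + r"\s+\d{1,2},?\s+\d{4}"    # Jan 15, 2024
    r")\b",
    re.IGNORECASE,
)

def tokenize_dates(text):
    """Return every date-like substring found in text, in order of appearance."""
    return [m.group(0) for m in DATE_PATTERN.finditer(text)]

print(tokenize_dates("Paid on 2024-01-15 and again on 3/2/2024."))
print(tokenize_dates("Meeting 15 January 2024 then Jan 5, 2023"))
```

Putting the most specific alternative first (the ISO-style format) keeps it from being partially consumed by the looser numeric pattern.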
Hi @gawa, thank you for your reply. I'd like to add a bit more context to my query. Basically, the raw data I receive contains only the "Description" column. What I need to produce is the "Desired Output".
@suwenchuan OK, then it becomes a very difficult problem. I need some time to come up with a solution.
