I have a dataset whose text fields contain RTF markup. I need to strip the markup and keep only the plain text. The output will be a CSV, so no formatting needs to be preserved.
Here are some examples of the input and the expected output after stripping:

| Input with RTF | Expected Output |
| --- | --- |
| {\rtf1\ansi\ansicpg1252\deff0\deflang1033{\fonttbl{\f0\fnil\fcharset0 Microsoft Sans Serif;}} \viewkind4\uc1\pard\f0\fs17 left vm \par } | left vm |
| {\rtf1\ansi\ansicpg1252\deff0\deflang1033{\fonttbl{\f0\fnil\fcharset0 Microsoft Sans Serif;}} \viewkind4\uc1\pard\f0\fs17 $1894.04 EFT\par } | $1894.04 EFT |
| {\rtf1\ansi\ansicpg1252\deff0\deflang1033{\fonttbl{\f0\fnil\fcharset0 Microsoft Sans Serif;}} {\colortbl ;\red0\green0\blue0;} \viewkind4\uc1\pard\cf1\f0\fs16 N/a, didn't leave a vm, will f/up\cf0\fs17\par } | N/a, didn't leave a vm, will f/up |
| {\rtf1\ansi\ansicpg1252\deff0\deflang1033{\fonttbl{\f0\fnil\fcharset0 Microsoft Sans Serif;}{\f1\fnil Microsoft Sans Serif;}} \viewkind4\uc1\pard\f0\fs17 12/19/14- lef vm regarding saturday meeting again. will follow up in January 2015 if they do not call back.\f1\par \par \f0 12/15/14- left vm regarding saturding meeting week of the 22nd, will follow up Friday if I do not hear back.\par \par \par 12/4- left vm. MD\f1\par \par \f0 11/26- left vm. MD\f1\par \par \f0 11/4- left vm. MD\f1\par \par \f0 thx for attending seminar. confirming Nov 1st sat, 1 or 3pm??\par } | 12/19/14- lef vm regarding saturday meeting again. will follow up in January 2015 if they do not call back.12/15/14- left vm regarding saturding meeting week of the 22nd, will follow up Friday if I do not hear back.12/4- left vm.11/26- left vm.11/4- left vm.thx for attending seminar. confirming Nov 1st sat, 1 or 3pm?? |

Is this something that can be done in Alteryx, possibly with a regex?
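
To make the expected behavior concrete, here is a rough sketch in plain Python of the kind of stripping I have in mind. It is not an Alteryx workflow and not a full RTF parser: the function name `strip_rtf` is made up for illustration, and it assumes that nested `{...}` groups only ever hold header tables such as `\fonttbl` and `\colortbl`, which is true for the samples above but not for RTF in general.

```python
import re

# Innermost brace group: a { ... } pair with no braces inside it.
_INNER_GROUP = re.compile(r"\{[^{}]*\}")
# A control word such as \par, \fs17 or \f0, plus one optional delimiter space.
_CONTROL_WORD = re.compile(r"\\[a-zA-Z]+-?\d* ?")


def strip_rtf(rtf: str) -> str:
    """Return the plain text of a simple RTF snippet (hypothetical helper)."""
    text = rtf.strip()
    # Drop the outer { ... } wrapper so the loop below only sees nested groups.
    if text.startswith("{") and text.endswith("}"):
        text = text[1:-1]
    # Assumption: nested groups contain only header tables, never body text,
    # so they can be removed wholesale, innermost first.
    while _INNER_GROUP.search(text):
        text = _INNER_GROUP.sub("", text)
    # Remove the remaining control words, then trim leftover whitespace.
    text = _CONTROL_WORD.sub("", text)
    return text.strip()


sample = (
    r"{\rtf1\ansi\ansicpg1252\deff0\deflang1033"
    r"{\fonttbl{\f0\fnil\fcharset0 Microsoft Sans Serif;}} "
    r"\viewkind4\uc1\pard\f0\fs17 $1894.04 EFT\par }"
)
print(strip_rtf(sample))  # prints: $1894.04 EFT
```

This reproduces the first three rows of the table. For the last row it keeps the " MD" initials, because they are ordinary text rather than RTF markup, so it differs from the expected output I listed there. It also ignores hex escapes such as `\'e9` and escaped braces, which do not appear in my samples.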