Hi, I am fairly new to Alteryx. I am trying to clean up an excel sheet that has a wide variety of data in the field. The data represents account numbers in a client's general ledger but the export formatting causes some words (which is the description of the specific entry) to get placed in the same row as account numbers (EX. 123456454, 213213213RC, FSH32433432, Tops Paint (2003), John Doe.
In my field I only want to have the following:
123456454 |
213213213RC |
FSH32433432 |
It is too hard to individually filter out words as there are 1000's of entries. I have been playing around with Regex_Replace, and I feel like I am close, but I am just wondering if there is a way to search for data that has a letter character in a specific placement (i.e find data with any letter in the 4th character spot and then replace it with a blank). Because then that would remove things like "Top Paints (2003)" and "John Doe" but keep "FSH32433432" and "213213213RC" and similar because I know the 4th character will always be a digit for the account number.
Eventually, what I will do is replace the now blanks from John Doe and Top Paints (2003) and use the multi-row formula to drag down the actual account number (I have gotten that part to work).
Sorry if that was confusing, maybe I am thinking about this process wrong?
Solved! Go to Solution.
Hi @GCEagles,
If I understand correctly, you want to test if the 4th character in any given string is a number you want to include it. Attached is a workflow that does that (the Text to Columns and Data Cleansing tools are to create the list based on how you presented the data).
Hey @T_Willins
Thanks for the reply, instead of using the filter at the end that splits the results, is it possible to have the output where there is one column but it would look like this for instance, where the words are now just blank entries:
12345654 |
213213213RC |
FSH24433432 |
I can't just remove those rows with the words (Tops and John Doe) because there are other relevant columns of data. So, if they are blank I can use my multi row formula to drag down the numbered accounts above (i.e 12345654).
Yes. I updated the RegEx tool to change the output from string to numeric. That way anything that is not a number (including spaces) will be Null(). Then using a Multi-Row Formula tool, any line that has a Null() in the RegExOut field gets updated in the Field1 field with the line above, saving a step of having to blank out the non-account numbers and replace them later.