Advent of Code is back! Unwrap daily challenges to sharpen your Alteryx skills and earn badges along the way! Learn more now.
Free Trial

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Remove special characters - different excel files except slash (/) in dates

PassION_es
10 - Fireball

Hello Alteryx Team,

 

I have a workflow that converts excel file to .dat extension format using csv.  The workflow already generate an output in .dat format however, it does not have yet cleanse the output. I need some help in the cleanse tools or any tool that will remove all special characters except to dates with /slash (date format is 01/01/2000) and except for period inside the amount (ex. 500000.25).

 

I only use cleanse tool when column headers are there.  However, in this case, the excel file has no header that is permanent and the first row contains data that is dynamic, changes every month. 

 

The question is, are there any tools other than cleanse tool that eliminates all special characters captured in excel file whatever data or number of columns/fields the excel file has as long as the slash(/) in all dates are retained? Also, is it possible to use one workflow to various excel files with different number of columns or different data set? We have different files that need to be converted from xlsx to .dat extension file and all these files may have different columns and dates are located in one or 2 of these columns.

 

Attached is the dummy input (excel) and workflow for your reference.  Thank you in advance.

 

 

33 REPLIES 33
FrederikE
13 - Pulsar

Hey @PassION_es,

 

You can try using a RegEx Function and restrict the Characters to the ones specified. Everything else will be dropped.

FrederikE_0-1666515593202.png

 

 

When you want to do this for multiple columns you can use this regex-function in a Multi-Formula Tool. 

PassION_es
10 - Fireball

Thanks @FrederikE ,

 

Would you mind to share in yxmd please? I tried using below tool, and tried running the workflow but  did not remove the special characters inside the file.

Please see below highlighted in yellow.

 

PassION_es_0-1666519498146.png

 

FrederikE
13 - Pulsar

Hey @PassION_es,

 

Here with your example file. 

PassION_es
10 - Fireball

Hi @FrederikE , Got it. Thank you.  I would like to remove also this slash (/) sign between words or combination of words or numbers, but not in dates.  Sorry I was not able to highlight below earlier.  I mean all special characters (colon, semicolon and others) should be disappeared from the output.  Dots or "Period" in figures/ amounts should also be retained:

 

PassION_es_1-1666525020941.png

 

Dots (Period) were disappeared in figures (amount). Dots should be retained:

PassION_es_1-1666525566747.png

 

 

PassION_es_0-1666525521045.png

 

 

Is this possible? Dates with format 01/01/2000 should be retained (with slash):

 

PassION_es_2-1666525186680.png

 

FrederikE
13 - Pulsar

Hey @PassION_es,

 

In this case, I'd separate the columns where you want to apply a different logic (shown with the Dates) and then change the RegEx and ex-/include the characters you want to keep. 

 

FrederikE_0-1666526867113.png

 

PassION_es
10 - Fireball

Hi @FrederikE , you almost got it. The only error I found is in the amount in figures where dot/period no longer existing.  Just like 432,530.75.  In output it becomes 43253075.  It should be with dot/period 432530.75.  The goal is that any amount in figures inside the data set should keep their dot/period on it, same with dates for slash (/) sign.  Thank you.

 

PassION_es_0-1666527679552.png

 

FrederikE
13 - Pulsar

Hey @PassION_es,

 

Right, there you go. 

 

FrederikE_0-1666607270557.png

 

PassION_es
10 - Fireball

hi @FrederikE , thank you for your patience.. I tried your workflow and it gives the desired result. However, I have another request. Actually, the highlighted cell is a sum of all amounts in column K. So that should also have dot and decimal points in it.

PassION_es_1-1666608780109.png

 

The sum  highlighted in yellow is not in .dot but instead has underscore before the decimal points.

PassION_es_3-1666609064598.png

Also, could it be possible that the position of  all data/fields will still be the same from the original? I mean the goal is to remove special characters only except on those figures and dates. Apologies, If I demanded too much. I have not gone yet to the reg_ex learning part but this workflow will be useful to us in Accounting when converting reports from excel file to another format (without special characters).

 

binuacs
21 - Polaris

@PassION_es Looking for something like below

 

binuacs_0-1666648420280.png

 

Labels
Top Solution Authors