Advent of Code is back! Unwrap daily challenges to sharpen your Alteryx skills and earn badges along the way! Learn more now.
Community is experiencing an influx of spam. As we work toward a solution, please use the 'Notify Moderator' option on the ellipsis menu to flag inappropriate posts.

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Remove special characters - different excel files except slash (/) in dates

PassION_es
10 - Fireball

Hello Alteryx Team,

 

I have a workflow that converts excel file to .dat extension format using csv.  The workflow already generate an output in .dat format however, it does not have yet cleanse the output. I need some help in the cleanse tools or any tool that will remove all special characters except to dates with /slash (date format is 01/01/2000) and except for period inside the amount (ex. 500000.25).

 

I only use cleanse tool when column headers are there.  However, in this case, the excel file has no header that is permanent and the first row contains data that is dynamic, changes every month. 

 

The question is, are there any tools other than cleanse tool that eliminates all special characters captured in excel file whatever data or number of columns/fields the excel file has as long as the slash(/) in all dates are retained? Also, is it possible to use one workflow to various excel files with different number of columns or different data set? We have different files that need to be converted from xlsx to .dat extension file and all these files may have different columns and dates are located in one or 2 of these columns.

 

Attached is the dummy input (excel) and workflow for your reference.  Thank you in advance.

 

 

33 REPLIES 33
PassION_es
10 - Fireball

hello @binuacs , Thank you. This is what I am looking for. It seems that I will be using only one workflow to run and convert different file with dynamic  number of columns. I only have a few questions to ask from the workflow and the output:

 

First, may you interpret the reg-ex or if condition you put in the formula so I may understood them correctly.  I added some special characters in the 2nd formula.  For the first formula, not sure why only the ! and * is trimmed.

PassION_es_0-1666669277293.png

 

 

In the output part, the result is correct. It converted from excel file to .dat extension format and it removes the special characters indicated in the formula. However the only one I noticed is that all blank fields/cells in the first row in the output has F10,F11,F12,etc.  This should be in blank cells same with the original file.  All excel input files converted to .dat file format have no column headers on it so we will assume that there are blank cells in the first line items and with data on the next line items.  Please see snapshot of the output:

 

PassION_es_2-1666670023875.png

 

 

Thanks.

 

 

 

 

binuacs
21 - Polaris

@PassION_es you can ignore the first Replace formula, i already added the same in the second ReplaceChar() formula, also in the output tool can you uncheck the option first name contains fields names

 

binuacs_0-1666688227981.png

 

below is my output

binuacs_1-1666688265145.png

 

 

PassION_es
10 - Fireball

Hello @binuacs , The output did not captured the first line item when I untick the "First row contains field names" in the output configuration. Please see the attached workflow for your checking .

 

PassION_es_0-1666692029226.png

 

binuacs
21 - Polaris

@PassION_es Can you check if your input file tool has the below option checked?

 

binuacs_0-1666693845403.png

 

PassION_es
10 - Fireball

Hello @binuacs , Yes it has.  The only difference with our input tool is that it is full path and the sheet is dynamic (you can refer to the yxmd I attached today).

 

PassION_es_0-1666694463878.pngPassION_es_1-1666694542191.png

 

binuacs
21 - Polaris

@PassION_es Your Dynamic input tool output the records with the field name F1, f2 etc? 

PassION_es
10 - Fireball

yes it is:

PassION_es_0-1666695207438.png

 

binuacs
21 - Polaris

@PassION_es when you check the input tool First row contains data then the output column names of the dynamic input tool should be shown like below, starting F1, f2 etc but seeing your output heading it is taking first row as heading and assigning F12, F13 for the blank columns. Attaching my workflow 

 

binuacs_0-1666695730325.png

 

binuacs
21 - Polaris

@PassION_es you need to make changes in the Dynamic Input tool like below

 

binuacs_0-1666706802183.png

 

PassION_es
10 - Fireball

Hi @binuacs,   I forgot that there is need to tick mark when clicking the edit of dynamic input tool.  And now it solves the issue. 

Apologies, I just compared the current manual output of .dat file and the Alteryx output, and I only just noticed that if each of the dates have formulas, it will never be converted as dates:

 

Alteryx Output:

PassION_es_0-1666750787343.png

 

Current/ Manual - Should be Output:

PassION_es_1-1666750891807.png

 

 

First question, is there a tool that will remove the formulas in the current input file before removing the special characters.  Please be informed that I already added in the formula the regex-'\d+\/\d{4}' for dates with format 06/2022.

 

Second question is, if you notice in the manual output, the commas before the highlighted date is counted as 4 (commas), but the generated output is only 2 commas, Is this has something to do with the code page used? I switch any code pages and tested  the output (CSV) of each from ISO 8859-1 Latin 1 but same number (2) of commas is generated.  Although, I haven't confirmed with the process owners if this is ok and accepted by the system for uploading.  Just  curious if you know the answer why its only generating two commas before dates.

 

3rd question is, the last line item is the total amount (this is with formula also).. In the Alteryx, it resulted to more than two decimal points but when generating manually, it only resulted to up to two decimals only.

 

Manual output: Sum total (last line item):

PassION_es_2-1666751615972.png

Alteryx output:

PassION_es_3-1666751704660.png

 

Thank you so much .

 

Labels