Community Spring Cleaning week is here! Join your fellow Maveryx in digging through your old posts and marking comments on them as solved. Learn more here!

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Matching the Field Names Properly

StockMarket
8 - Asteroid

Hello

 

I have to create a BATCH MACRO in which I have to process multiple csv files which have got the following 10 fields inside them -

 

FileName,Ticker,OLD_Date,Time,Open,High,Low,Close,Volume,Open Interest

 

FileName
Ticker
OLD_Date
Time
Open
High
Low
Close
Volume
Open Interest


Now, ever once in a while, I end up with a csv file, which is either missing a particular required field, or have got some additional field which should not be there and sometimes there are cases of having spelling mistakes in the field names as well.

 

So I am trying to design a workflow which will WARN or ALERT me if there is any such error present within my csv files. I want to make sure that all the new data files that I process through my workflow, have the DITTO SAME data structure.

 

Here are the steps that I am trying to accomplish in this workflow -

 

First of all, convert all the Fields to Lower Case

 

Then edit the "open interest" field by removing the space and adding an underscore.

 

Then make sure that the csv file has got these 10 Fields inside it

AND in the same sequence,

AND with the same spelling for each field -

 

filename
ticker
old_date
time
open
high
low
close
volume
open_interest "note that I have removed the space and added an underscore instead"


Please suggest me which particular Alteryx tools will help me to do the above steps in most efficient manner.

 

I have also attached the Alteryx Worflow Package .yxzp file and the sample data file, so that you can easily open it up on your machine. I have used the latest version of Alteryx for creating this workflow. You may please have a look.

 

Thanks and Regards

 

 

Sample Data.png

 

 

 

Sample Workflow.png

 

 

Workfflow Files Attached.png

9 REPLIES 9
StockMarket
8 - Asteroid

Any ideas, anyone ? I thought that my question would be a fairly normal user case scenario, where someone would like to do the checking of the Data Fields, before going ahead with the various type of data cleaning and processing steps etc. So I was expecting quick response on this.

 

Maybe I have made some mistake in uploading the files, because of which you guys are not being able to download and run the batch macro on your system.

 

I have uploaded the Batch Macro on the community forum, for the very first time, so I am not totally sure, how to correctly upload it, so that other users can run it on their machines.

 

So I am requesting from the members who have helped me in the past, that can you please tell if you are able to run the batch macro uploaded by me ?

 

Thanks a lot and Best Regards.

 

@Qiu @AkimasaKajitani  @afv2688 

AkimasaKajitani
17 - Castor
17 - Castor

Hi @StockMarket 

 

You can use Message tool and you have to put the reference fields list into the workflow.

And I make sample workflow. This is prototype because there are a lot of things to consider to be perfect workflow.

AkimasaKajitani_4-1624081124704.png

 

Message tool configuration sample:

 

AkimasaKajitani_3-1624081083093.png

 

 

If you want to stop the workflow running when error occurs, you can use the "cancel Running Workflow on Error" option.

 

AkimasaKajitani_0-1624080254690.png

 

Error Message Sample:

AkimasaKajitani_5-1624081200335.png

 

StockMarket
8 - Asteroid

Thanks for replying @AkimasaKajitani 

 

I am trying to understand the various tools that you have proposed in the workflow. I will go through them one by one and then post my understanding on the thread itself. Since I am new to Alteryx therefor I have to go through each tool separately for understanding what it is doing, how is it doing it and so on.

 

In the mean time, can you please CONFIRM if you were able to download and run the BATCH MACRO that I have attached in the original post or not ? Have I uploaded it correctly? Can someone please run the batch macro on their system and confirm that it is running properly or not?

 

I did try to create the PACKAGE FILE, but I am not sure if this automatically includes all the required sample data files and workflow file and the macro file etc. or not. I do not know how exactly to upload the batch macro workflows correctly. Is uploading the package file sufficient or if I have to upload the files separately as well?

 

With Best Regards

AkimasaKajitani
17 - Castor
17 - Castor

1. About your Batch macro

 

Your uploaded packaged workflow is work well(of course the Batch macro work well).

 

2.About Package file doesn't all input files

 

Packaged workflow have only the path that is written in workflow directly.

For example, it does not contain the file of the directory tool setting or the file that if you use wildcard(*). 

I don't know the way of those files and I believe that there are not such a way.

 

 

 

AkimasaKajitani
17 - Castor
17 - Castor

Hi @StockMarket 

 

Let's back to my uploaded workflow.

 

Point is follows.

 - Field Info tool shows the schema of data stream.

 - If reference field list, you can compare between the list and output of field info tool.

 - You can use Message tool to output the error when you want.

 - The error conditions is that you can decide from comparing result.

 - You can control that if the error occurs, the workflow go or no go on Workflow option. 

 

And I have one correction. Count tool is not needed. Please delete it.

AkimasaKajitani_0-1624088620740.png

 

danilang
19 - Altair
19 - Altair

Hi @StockMarket 

 

As per your request, here's an explanation of how @AkimasaKajitani's field comparison works

 

danilang_0-1624126764556.png

 

The Field Info tool returns the list of fields along with their sizes and types.  The following ReordID tool just add a FieldOrder tool that increments for each row.  Combined they give you an output that looks like this

danilang_1-1624127163372.png

This is the same list that is reproduced in the Text Input called Master Field List.  These two lists are compared in the Join tool with the records matched on order, name, type and size.  All the records that match come out the J(oin) anchor of the Join tool.  The unmatched records from the L input come out L output.  Ditto for the R inputs and outputs.  

 

In a perfect run, all the fields on both sides will match.  If a field is missing, misspelled, or out of order, that's where the L and R outputs come into play.  Each of them is connected to a Count tool which simply counts the number of records.  The next tool is a Message tool, which writes a message in the results window if a condition is met.  In this case the message gets triggered if the count <> 0.  you'll notice that I use a Count tool, where @AkimasaKajitani used a Summarize in count mode.  The difference between the 2 is that the Count tool always returns a number even if that number is 0.  In the number of records going to a Summarize tool is 0, then the Summarize tool doesn't output any thing at all. 

 

To test this, change the name of one of the fields in your input file.  When you run it you'll see messages like this in your results window.  For this test I changed "FileName" to "File Name"      

danilang_2-1624127994914.png

Looking at the Brose tools connected to the L and R outputs of the Join tool will tell you what fields are at fault.

 

Dan

 

 

StockMarket
8 - Asteroid

Thank you so much for the great explanations @danilang It makes the whole process so clear. Otherwise I was having a hard time figuring out all of this on my own.

 

And I really appreciate your point regarding the Count Tool and Summarize Tool as well. I would also prefer Count Tool in my workflow for this purpose.

 

Once I run this workflow on my INPUT Files, then I can be totally sure that there are no problems with the data structure. All the fields etc. match perfectly with my older data, and I can import the new data without worries.

 

Thanks a lot

Best Regards

StockMarket
8 - Asteroid

The solution given above by @danilang named [example5v3(Differing Fields).yxmc] and @AkimasaKajitani  is working totally fine for the single CSV file, used in that workflow.

 

But I am having a real hard time in trying to convert this workflow in a way, such that I can run it as a batch macro on hundreds of csv files present in my folder.

 

Usually, I just make changes in these LAST TWO TOOLS from my workflow as given in the original post of this thread -

Batch Macro LAST PORTION.png

Where I simply use a FORMULA TOOL to rename the file according to the field "FileName" and then it just makes the workflow go into loop, again and again, till the last csv file of my folder is done. I really do not understand the logic behind it, but I simply add these tools to the end of any workflow to make them run repeatedly.

 

But in this current situation, of danilang workflow, we are not actually doing any kind of file renaming etc. of any type. So how do I make this workflow run repeatedly? What tool do I need to add to it, so that it can run repeatedly for all csv files in my folder? I have attached a few more csv files for this work.

 

Do I need to add the same portion as shown in the above snapshot to this workflow to make it work, or if there is any other alternative method as well, in which I will not have to keep on renaming the csv files according to the FieldNames ? Please suggest some methods for achieving this.

 

Thanks a lot.

Best Regards

 

AkimasaKajitani
17 - Castor
17 - Castor

Hi @StockMarket 

 

I think that my workflow works well as expected.

 

You can add the macro after directory tool.

AkimasaKajitani_0-1624540044103.png

Folder configuration :

AkimasaKajitani_3-1624540227930.png

 

I set the relative folder to File/Folder setting.

 

Directory tool setting :

AkimasaKajitani_2-1624540177486.png

 

AkimasaKajitani_4-1624540259076.png

 

 

AkimasaKajitani_1-1624540136147.png

 

If there are anything that you don't expect, please let me know.

Labels