Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.

comparing column headers and notes in header cells to see if two templates differ

grandchallenges
5 - Atom

Hi,

 

I am pretty new at this and have only done 2 workflows.  I now want to compare two Excel templates and find differences.  I feel OK about checking whether the data values in the columns match but I don't know how to compare the column names and the instructions/rules that are in an Excel Note for each column.  If you have ideas on how to do that, can you let me know?

Many thanks,

Linda 

5 REPLIES 5
AngelosPachis
16 - Nebula
  1. Hi Linda ( @grandchallenges )

 

To compare the column headers between the two files, you can use a tool called "Field Info" that will return the different column headers of each input as records. You can use that tool to both inputs to get two lists of the different column headers in each file; then use a Join tool to see if the column names match. Those that don't will fall out of the L or R output anchors of the join tool.

 

AngelosPachis_0-1620760485997.png

 

TheOC
15 - Aurora
15 - Aurora

hey @grandchallenges!

I cant improve on @AngelosPachis 's answer, but just to give you a little more information as you say you're fairly new to Alteryx, there is a documentation piece on the field info tool:
https://help.alteryx.com/current/designer/field-info-tool

And there are also sample workflows in Alteryx for this tool, to help you learn them:

TheOC_0-1620829592943.png

 

Hitting example will bring up a sample workflow with this tool being used.


Hope this helps!
TheOC

 


Bulien
danilang
19 - Altair
19 - Altair

Hi @grandchallenges 

 

By default, the Input tool will only read data from a workbook and not formatting or annotations.  There is a way to get around this though.  .xlsx files are just zip files with a .xlsx extension.  By using an Input tool to open the .xlsx as a zip file and then parsing the various xml files within, you can find out all sorts of secrets.

 

danilang_0-1620831304697.png

 

In your particular case, comments are saved in the xml/comments1.xml file inside the archive.  The workflow loads this file, gets the all the <comment> xml nodes and parses the text of the comment and then finds the column number from the cell reference "A1", "AB42".  The bottom stream opens the file normally, uses a field info tool, to get the column names and assign a sequential column position.  Join this to output from the top stream on column position and you get something like this

danilang_2-1620831813555.png

 

 

 

danilang_1-1620831734927.png

 

Dan

 

  

JokeFun
8 - Asteroid

I have a file with more than 10K rows where the field contains text string, as well as some rows with formulas like =-@ abcd.

The purpose is to identify rows with such formulas and replace by abcd (removing =-@).

 

However, I was seeing this error message.

Error: XML Parse (24): input ended before all started tags were ended; last tag started is 'row' at Line:1 and Column:100001

 

Then I set the field length to be 10000000000 something like that, even larger. Then it works.

 

 

caltang
17 - Castor

Very helpful @danilang !!

Labels