
Alteryx Designer Desktop Discussions

SOLVED

Updating RegEx tool in Batch Macro

Citauma
7 - Meteor

Hi there,

 

I have an issue that will get resolved with a positive answer to this question... please help!

 

I am currently trying to use RegEx syntax and the RegEx tool to parse an XML file (yes, I know this isn't advised, but here we are lol).

<Layer1>

             <Table>

             ...

             </Table>

 

             <Table>

             ...

             </Table>

...

 

             <Table>

             ...

             </Table>

</Layer1>

 

I have several XML files I need to extract information from. Each XML file follows the pattern above but contains a varying number of the tags I require. I use REGEX_CountMatches([LAYER1_OuterXML], "<TABLE") to determine how many "<TABLE"s are in the XML, and then dynamically create a regular expression that is passed into a batch macro to update the RegEx tool (which is set to Output Method: Parse).

 

For example, if there are 2 <TABLE>s:

(<TABLE.*?>.*?</TABLE.*?>).*?(<TABLE.*?>.*?</TABLE.*?>)

 

If there are 3 <TABLE>s:

(<TABLE.*?>.*?</TABLE.*?>).*?(<TABLE.*?>.*?</TABLE.*?>).*?(<TABLE.*?>.*?</TABLE.*?>)

and so on.

 

The problem is that each time part of the regular expression is enclosed in "( )", a new output field is (correctly) created. But when I pass the newly created regular expression into the RegEx tool, I get an error saying that the "hard-coded" number of output fields (in the RegEx tool inside the batch macro) differs from the number of output fields created by the updated regular expression.
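
To make the mismatch concrete, here is a rough sketch in Python rather than Alteryx (Python's `re` module stands in for the RegEx tool, and the sample XML string, `build_pattern`, and the case-insensitive matching are assumptions made for illustration). It builds the expression the same way as described above and shows that the number of capture groups, and therefore the number of output fields, changes with the number of tables:

```python
import re

# One capture group per table, exactly as in the expressions above.
TABLE_GROUP = r"(<TABLE.*?>.*?</TABLE.*?>)"


def build_pattern(n):
    """Hypothetical helper: join n table groups with lazy '.*?' separators."""
    return ".*?".join([TABLE_GROUP] * n)


# Made-up sample input containing three tables.
xml = "<Layer1><Table>a</Table><Table>b</Table><Table>c</Table></Layer1>"

# Rough analogue of REGEX_CountMatches([LAYER1_OuterXML], "<TABLE").
# Case-insensitive matching is an assumption here.
n_tables = len(re.findall(r"<TABLE", xml, flags=re.IGNORECASE))

pattern = build_pattern(n_tables)

# The number of capture groups tracks the number of tables, so a tool
# configured for a fixed number of output fields cannot accept it.
group_count = re.compile(pattern, flags=re.IGNORECASE | re.DOTALL).groups
print(n_tables, group_count)
```

Each input file can therefore ask for a different number of output columns, which is what a Parse configuration with a fixed field list rejects.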

2 REPLIES
KaneG
Alteryx Alumni (Retired)

Hi @Citauma,

 

You're not going to get past that error message with what you want to do... time to design it a different way. Basically, because you have a dynamic number of tables, it's best to put them into a known number of columns rather than trying to accommodate an unknown number. It generally makes the data easier to deal with.

 

In HTML there is often an unknown number of tables, and the way I handle that is with the RegEx tool set to Tokenize with Split to Rows, using the expression <table(.*?)</table>. (You can also use <table(.*?)</table.*?> or even (<TABLE.*?>.*?</TABLE.*?>), but you can only have one Marked Group.)

 

You can then parse from there.
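
As a rough illustration of the split-to-rows idea outside Alteryx, here is a Python sketch (Python's `re.findall` stands in for Tokenize with Split to Rows; the sample strings, `split_tables_to_rows`, and the case-insensitive flag are assumptions). One expression with a single marked group returns one row per table, however many tables there are:

```python
import re

# A single marked group, as in the last variant mentioned above.
ONE_TABLE = r"(<TABLE.*?>.*?</TABLE.*?>)"


def split_tables_to_rows(xml):
    """Hypothetical helper: return one list entry ("row") per table found."""
    return re.findall(ONE_TABLE, xml, flags=re.IGNORECASE | re.DOTALL)


# Made-up inputs with different numbers of tables.
two = "<Layer1><Table>a</Table> <Table>b</Table></Layer1>"
three = "<Layer1><Table>a</Table><Table>b</Table><Table>c</Table></Layer1>"

rows_two = split_tables_to_rows(two)
rows_three = split_tables_to_rows(three)

# The same fixed pattern handles both inputs; only the row count differs.
print(len(rows_two), len(rows_three))
```

Because the pattern never changes, the output schema stays fixed (a single column), and each row can then be parsed further with a second, fixed expression.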

Citauma
7 - Meteor

Thanks a lot!

 

I had 2 or 3 batch macros to make this as dynamic as possible. This approach removes that requirement.

 

Thanks again @
