Community Spring Cleaning week is here! Join your fellow Maveryx in digging through your old posts and marking comments on them as solved. Learn more here!

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Nested XML Parser

vishwakar
8 - Asteroid

Hi

 

I am trying to parse the below xml file , I need the relation name and table name from the below , I have tried using XML Parse but could not get the desired results, Any help is appreciated.

 

 

 

<relation join='inner' type='join'>
          <clause type='join'>
            <expression op='='>
              <expression op='[account].[id]' />
              <expression op='[contact].[accountid]' />
            </expression>
          </clause>
          <relation join='inner' type='join'>
            <clause type='join'>
              <expression op='='>
                <expression op='[opportunity].[accountid]' />
                <expression op='[account].[id]' />
              </expression>
            </clause>
            <relation join='inner' type='join'>
              <clause type='join'>
                <expression op='='>
                  <expression op='[pricebookentry].[product2id]' />
                  <expression op='[tproduct2].[id]' />
                </expression>
              </clause>
              <relation join='inner' type='join'>
                <clause type='join'>
                  <expression op='='>
                    <expression op='[opportunitylineitem].[pricebookentryid]' />
                    <expression op='[pricebookentry].[id]' />
                  </expression>
                </clause>
                <relation join='inner' type='join'>
                  <clause type='join'>
                    <expression op='='>
                      <expression op='[opportunity].[id]' />
                      <expression op='[opportunitylineitem].[opportunityid]' />
                    </expression>
                  </clause>
                  <relation name='opportunity' table='[sfdc].[opportunity]' type='table' />
                  <relation name='opportunitylineitem' table='[sfdc].[opportunitylineitem]' type='table' />
                </relation>
                <relation name='pricebookentry' table='[sfdc].[pricebookentry]' type='table' />
              </relation>
              <relation name='tproduct2' table='[sfdc].[tproduct2]' type='table' />
            </relation>
            <relation name='account' table='[sfdc].[account]' type='table' />
          </relation>
          <relation name='contact' table='[sfdc].[contact]' type='table' />
        </relation>

 Thanks

Karthik

11 REPLIES 11
KaneG
Alteryx Alumni (Retired)

Hi @vishwakar,

 

The Issue that your having is that each relation name is nested so the decision that you need to make is whether you are happy to just extract the information, or whether you need to parse the document.

 

If you need to parse the document, then an iterative macro will be best which basically runs an XML Parse tool on relation, until it can't fine relation name anymore.

 

If you are happy to just extract the data, then bring in the doc as csv with no delimiter. From there you have 2 options:

- REGEX: ^<relation name=(.*?) table=(.*?) type=(.*?) />$

- Filter, Replace, Text2Col (Split to rows to put table on another row), Text2Col (split on the equals), Cross Tab.

 

The attached file shows both methods to extract the data.

 

Kane

vishwakar
8 - Asteroid

Hi @KaneG

 

Yeah the below methods doesnt fit in to the flow as I cant convert it to csv file , Can you help me an example of the Iterative macro for the below using XML Parser.

 

Thanks for your help.

 

Thanks

Karthik

KaneG
Alteryx Alumni (Retired)
Hi Karthik,

You don't need to convert the file, just tell Alteryx that it is a csv with delimiter = \0.

Kane
vishwakar
8 - Asteroid

Thanks Kane , I need to bring it as XML File as I am also parsing other elements it it. I see that the delimeter option is when we specify alteryx that format is csv and give it with a delimeter option which might not work as ther other root and child elements might break.

 

Thanks

Vishwa

KaneG
Alteryx Alumni (Retired)

Hi @vishwakar,

 

That makes sense, looks like an iterative Macro would be the best way then. I'll try to create something if I get a chance in the next couple of days....

vishwakar
8 - Asteroid

Thanks @KaneG

 

I am looking forward for your example. Thanks for your time and definately helps me in completing the workflow to get the desired result.

 

Thanks

Karthik

vishwakar
8 - Asteroid

Community Any Help is appreciated and wanted to know if Alteryx is the right option for doing this.

 

Thanks for your time and help in advance.

KaneG
Alteryx Alumni (Retired)

Hi @vishwakar,

 

I finally had a chance to take another look at this, it is not a full solution but I quickly put together an iterative Macro that removes the nesting in a pretty rough fashion. It should give you an idea of how to proceed. 

 

Note: This Macro still uses some 'cheat' methods to align fields etc, that would be best made dynamic in production. In this kind of situation, the actual field names that get parsed are normally the same and so don't need to be made dynamic. 

 

Kane

vishwakar
8 - Asteroid

Hi KaneG

 

Thanks for your time and help, I have checked your macro and it definetly helps. But doesnt get the output in the desired way I was expecting

 

I just want to go and see if the Regex metho of just extracting the data will work out . I know you have extracted the relation name ,table and type earlier but just to add , I need the datasource name and caption added 

 

  <datasources>
    <datasource caption='opportunity (sfdc.opportunity)+ (warehouse)' inline='true' name='oracle.42178.459579328700' version='8.3'>
      <connection class='oracle' dbname='warehouse' odbc-native-protocol='yes' port='5432' server='testserver.com' server-oauth='' username='helloiam' workgroup-auth-mode='prompt'>
        <relation join='inner' type='join'>
          <clause type='join'>
            <expression op='='>
              <expression op='[account].[id]' />
              <expression op='[contact].[accountid]' />
            </expression>
          </clause>
          <relation join='inner' type='join'>
            <clause type='join'>
              <expression op='='>
                <expression op='[pricebookentry].[product2id]' />
                <expression op='[tproduct2].[id]' />
              </expression>
            </clause>
            <relation join='inner' type='join'>
              <clause type='join'>
                <expression op='='>
                  <expression op='[opportunitylineitem].[pricebookentryid]' />
                  <expression op='[pricebookentry].[id]' />
                </expression>
              </clause>
              <relation join='inner' type='join'>
                <clause type='join'>
                  <expression op='='>
                    <expression op='[opportunity].[accountid]' />
                    <expression op='[account].[id]' />
                  </expression>
                </clause>
                <relation join='inner' type='join'>
                  <clause type='join'>
                    <expression op='='>
                      <expression op='[opportunity].[opportuntiyid]' />
                      <expression op='[opportunitylineitem].[opportunityid]' />
                    </expression>
                  </clause>
                  <relation name='opportunity' table='[sfdc].[opportunity]' type='table' />
                  <relation name='opportunitylineitem' table='[sfdc].[opportunitylineitem]' type='table' />
                </relation>
                <relation name='account' table='[sfdc].[account]' type='table' />
              </relation>
              <relation name='pricebookentry' table='[sfdc].[pricebookentry]' type='table' />
            </relation>
            <relation name='tproduct2' table='[sfdc].[tproduct2]' type='table' />
          </relation>
          <relation name='contact' table='[sfdc].[contact]' type='table' />
        </relation>
      </connection>
    </datasource>
    <datasources>

 

Expecting Output

 

I am ok with any way of bringing the file if i am able to achieve the below output 

 

Data source captionDatasource NameRelation Nametabletype
opportunity (sfdc.opportunity)+ (warehouse)oracle.42178.459579328700Opportunity[sfdc].[opportunity]table
opportunity (sfdc.opportunity)+ (warehouse)oracle.42178.459579328700opportunitylineitem[sfdc].[opportunitylineitem]table
opportunity (sfdc.opportunity)+ (warehouse)oracle.42178.459579328700account[sfdc].[account]table
opportunity (sfdc.opportunity)+ (warehouse)oracle.42178.459579328700pricebookentry[sfdc].[pricebookentry]table
opportunity (sfdc.opportunity)+ (warehouse)oracle.42178.459579328700tproduct2[sfdc].[tproduct2]table
opportunity (sfdc.opportunity)+ (warehouse)oracle.42178.459579328700contact[sfdc].[contact]table

 

Again Thanks alot for time and support , Cheers :)

 

Thanks

Karthik

Labels