Alteryx Designer Desktop Discussions

vishwakar · ‎07-03-2016

Hi

I am trying to parse the below xml file , I need the relation name and table name from the below , I have tried using XML Parse but could not get the desired results, Any help is appreciated.

<relation join='inner' type='join'>
          <clause type='join'>
            <expression op='='>
              <expression op='[account].[id]' />
              <expression op='[contact].[accountid]' />
            </expression>
          </clause>
          <relation join='inner' type='join'>
            <clause type='join'>
              <expression op='='>
                <expression op='[opportunity].[accountid]' />
                <expression op='[account].[id]' />
              </expression>
            </clause>
            <relation join='inner' type='join'>
              <clause type='join'>
                <expression op='='>
                  <expression op='[pricebookentry].[product2id]' />
                  <expression op='[tproduct2].[id]' />
                </expression>
              </clause>
              <relation join='inner' type='join'>
                <clause type='join'>
                  <expression op='='>
                    <expression op='[opportunitylineitem].[pricebookentryid]' />
                    <expression op='[pricebookentry].[id]' />
                  </expression>
                </clause>
                <relation join='inner' type='join'>
                  <clause type='join'>
                    <expression op='='>
                      <expression op='[opportunity].[id]' />
                      <expression op='[opportunitylineitem].[opportunityid]' />
                    </expression>
                  </clause>
                  <relation name='opportunity' table='[sfdc].[opportunity]' type='table' />
                  <relation name='opportunitylineitem' table='[sfdc].[opportunitylineitem]' type='table' />
                </relation>
                <relation name='pricebookentry' table='[sfdc].[pricebookentry]' type='table' />
              </relation>
              <relation name='tproduct2' table='[sfdc].[tproduct2]' type='table' />
            </relation>
            <relation name='account' table='[sfdc].[account]' type='table' />
          </relation>
          <relation name='contact' table='[sfdc].[contact]' type='table' />
        </relation>

Thanks

Karthik

KaneG · ‎07-03-2016

Hi @vishwakar,

The Issue that your having is that each relation name is nested so the decision that you need to make is whether you are happy to just extract the information, or whether you need to parse the document.

If you need to parse the document, then an iterative macro will be best which basically runs an XML Parse tool on relation, until it can't fine relation name anymore.

If you are happy to just extract the data, then bring in the doc as csv with no delimiter. From there you have 2 options:

- REGEX: ^<relation name=(.*?) table=(.*?) type=(.*?) />$

- Filter, Replace, Text2Col (Split to rows to put table on another row), Text2Col (split on the equals), Cross Tab.

The attached file shows both methods to extract the data.

Kane

vishwakar · ‎07-04-2016

Hi @KaneG

Yeah the below methods doesnt fit in to the flow as I cant convert it to csv file , Can you help me an example of the Iterative macro for the below using XML Parser.

Thanks for your help.

Thanks

Karthik

KaneG · ‎07-04-2016

Hi Karthik,

You don't need to convert the file, just tell Alteryx that it is a csv with delimiter = \0.

Kane

vishwakar · ‎07-04-2016

Thanks Kane , I need to bring it as XML File as I am also parsing other elements it it. I see that the delimeter option is when we specify alteryx that format is csv and give it with a delimeter option which might not work as ther other root and child elements might break.

Thanks

Vishwa

KaneG · ‎07-05-2016

Hi @vishwakar,

That makes sense, looks like an iterative Macro would be the best way then. I'll try to create something if I get a chance in the next couple of days....

vishwakar · ‎07-06-2016

Thanks @KaneG

I am looking forward for your example. Thanks for your time and definately helps me in completing the workflow to get the desired result.

Thanks

Karthik

vishwakar · ‎07-09-2016

Community Any Help is appreciated and wanted to know if Alteryx is the right option for doing this.

Thanks for your time and help in advance.

KaneG · ‎07-14-2016

Hi @vishwakar,

I finally had a chance to take another look at this, it is not a full solution but I quickly put together an iterative Macro that removes the nesting in a pretty rough fashion. It should give you an idea of how to proceed.

Note: This Macro still uses some 'cheat' methods to align fields etc, that would be best made dynamic in production. In this kind of situation, the actual field names that get parsed are normally the same and so don't need to be made dynamic.

Kane

vishwakar · ‎07-19-2016

Hi KaneG

Thanks for your time and help, I have checked your macro and it definetly helps. But doesnt get the output in the desired way I was expecting

I just want to go and see if the Regex metho of just extracting the data will work out . I know you have extracted the relation name ,table and type earlier but just to add , I need the datasource name and caption added

  <datasources>
    <datasource caption='opportunity (sfdc.opportunity)+ (warehouse)' inline='true' name='oracle.42178.459579328700' version='8.3'>
      <connection class='oracle' dbname='warehouse' odbc-native-protocol='yes' port='5432' server='testserver.com' server-oauth='' username='helloiam' workgroup-auth-mode='prompt'>
        <relation join='inner' type='join'>
          <clause type='join'>
            <expression op='='>
              <expression op='[account].[id]' />
              <expression op='[contact].[accountid]' />
            </expression>
          </clause>
          <relation join='inner' type='join'>
            <clause type='join'>
              <expression op='='>
                <expression op='[pricebookentry].[product2id]' />
                <expression op='[tproduct2].[id]' />
              </expression>
            </clause>
            <relation join='inner' type='join'>
              <clause type='join'>
                <expression op='='>
                  <expression op='[opportunitylineitem].[pricebookentryid]' />
                  <expression op='[pricebookentry].[id]' />
                </expression>
              </clause>
              <relation join='inner' type='join'>
                <clause type='join'>
                  <expression op='='>
                    <expression op='[opportunity].[accountid]' />
                    <expression op='[account].[id]' />
                  </expression>
                </clause>
                <relation join='inner' type='join'>
                  <clause type='join'>
                    <expression op='='>
                      <expression op='[opportunity].[opportuntiyid]' />
                      <expression op='[opportunitylineitem].[opportunityid]' />
                    </expression>
                  </clause>
                  <relation name='opportunity' table='[sfdc].[opportunity]' type='table' />
                  <relation name='opportunitylineitem' table='[sfdc].[opportunitylineitem]' type='table' />
                </relation>
                <relation name='account' table='[sfdc].[account]' type='table' />
              </relation>
              <relation name='pricebookentry' table='[sfdc].[pricebookentry]' type='table' />
            </relation>
            <relation name='tproduct2' table='[sfdc].[tproduct2]' type='table' />
          </relation>
          <relation name='contact' table='[sfdc].[contact]' type='table' />
        </relation>
      </connection>
    </datasource>
    <datasources>

Expecting Output

I am ok with any way of bringing the file if i am able to achieve the below output

Data source caption	Datasource Name	Relation Name	table	type
opportunity (sfdc.opportunity)+ (warehouse)	oracle.42178.459579328700	Opportunity	[sfdc].[opportunity]	table
opportunity (sfdc.opportunity)+ (warehouse)	oracle.42178.459579328700	opportunitylineitem	[sfdc].[opportunitylineitem]	table
opportunity (sfdc.opportunity)+ (warehouse)	oracle.42178.459579328700	account	[sfdc].[account]	table
opportunity (sfdc.opportunity)+ (warehouse)	oracle.42178.459579328700	pricebookentry	[sfdc].[pricebookentry]	table
opportunity (sfdc.opportunity)+ (warehouse)	oracle.42178.459579328700	tproduct2	[sfdc].[tproduct2]	table
opportunity (sfdc.opportunity)+ (warehouse)	oracle.42178.459579328700	contact	[sfdc].[contact]	table

Again Thanks alot for time and support , Cheers :)

Thanks

Karthik

Alteryx Designer Desktop Discussions

Nested XML Parser