Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Using Parsing and Multi-Field to format document

davidlocke
7 - Meteor

I have a file that comes out of my tax software that need to break out into columns and apply some kind of multi-field condition to.

 

The basic flow is data is stacked in the following order:

  1. IRS Form Number 1120 Pg. 1 Line number - Line Description
  2. Software's TRC No - TRC Description
  3. Software's TCC No - TCC Descripion
  4. Our Company's GL A/C No - G/L Description

 

Some inherent data flow considerations are:

  • Form Line Numbers are alphanumeric and will include letters and numbers and will not be the same number of characters
  • Form Line Descriptions while generally alpha numeric, can contain special characters (":", "&", "%", and "/" being the most common)
  • TRC No are /d{2}-/d{3}
  • TRC Descriptions while generally alpha numeric, can contain special characters (":", "&", "%", and "/" being the most common)
  • TCC No are /d{3}
  • TCC Descriptions while generally alpha numeric, can contain special characters (":", "&", "%", and "/" being the most common)
  • G/L AC No.s are /w{6}
  • G/L Descriptions are Descriptions while generally alpha numeric, can contain special characters (":", "&", "%", and "/" being the most common)
  • Not every Line Number has a TRC Number
    • If it does not have a TRC Number, it will not have a TCC Number, or G/L Number
  • Not every TRC Number has a TCC Number
    • If it does not have a TCC Number, it will not have a G/L Number
  • Not every TCC Number has a G/L Number
  • If a Line Number has a TRC Number, it can have 1 or more TRC Numbers
  • If a TRC Number has a TCC Number, it can have 1 or more TCC Numbers
  • If a TCC Number has a G/L No., it can have 1 or more G/L Numbers
  • Each G/L No can have ONLY ONE Line No, TRC No., and TCC No.

 

End Result would be to organize the G/L accounts by Line No and Description; TRC No. and Description; TCC No. and Description. Then I would concatenate the TRC No and TCC No. in a separate column.

GL Acct No.GL Desc1120 Pg. 1 Ln No.1120 Pg. 1 Ln DescTRC/TCC ComboTRC No.TRC DescTCC No.TCC Desc
400000Sales1aGross Receipts or Sales30-100 01030-100Gross Receipts or Sales010Gross Receipts/Sales
400001Product Sales-Manual1aGross Receipts or Sales30-100 01030-101Gross Receipts or Sales010Gross Receipts/Sales

 

Ideally, I would be able to have a filter near the end where True would be !Null G/L and False I still would be able to see the Line No/TRC/TCCs that do not have an account, but that still has the TCC on the same line with its TRC and Line No on the same row on the F.

 

I have tried working with Parse and for the life of me I can't get the parse to work.  I can text to columns, but because each is slightly different it creates more columns than I want and separates the TRC Number as it contains I dash. 

6 REPLIES 6
grossal
15 - Aurora
15 - Aurora

Hi @davidlocke,

(@others: Feel free to use my parsing as a starting point - I am not sure if I can figure out the whole thing)

 

I won't have a full solution for you, but a first parsing attempt:

 

grossal_0-1588278145543.png

Result looks like this:

grossal_1-1588278176017.png

 

Is the parsing right or do we need to adapt something? Workflow is attached.

 

Best

Alex

davidlocke
7 - Meteor

Hey @grossal,

 

That didn't quite work for me.  However, I was able to find my parsing error.  I was using / instead of \ on my metacharacters.

 

So I have my data parsed.  Now I need an answer to turn that into my final product.  Maybe, Cross-tab/multi-field? 

 

@TonyA you helped me on something on the same general lines earlier, do you have an idea?

TonyA
Alteryx Alumni (Retired)

I'll take a look at this tonight. It won't be a simple crosstab -- we'll need to fill down the line numbers, TRC's and TCC's according to a set of rules.

 

Just to verify, is it correct to say that the only possibilities for output records is:

  • Line Number only
  • Line Number and TRC
  • Line Number, TRC, and TCC
  • Line Number, TRC, TCC, and GL?

Questions:

  • How do I identify line numbers? If a line number always includes alphabetic characters, then I can distinguish it from TRC and TCC codes, but if GL accounts are and line numbers are both alphanumeric and line numbers can have any number of characters, how do I distinguish between 6 character line number and a GL code?
  • If I have a new TCC number, I assume the line number and TRC carry down. If I get a new TRC and it is not followed by a TCC, is the TCC now null or does the previous TCC carry down? If the TCC is null, then I assume the TRC must be followed by either a TCC or a line number.
davidlocke
7 - Meteor

@TonyA,

 

Your whole reply didn't show up in the community for some reason. So I am pasting here with my answers in bold:

Just to verify, is it correct to say that the only possibilities for output records is:

  1. Line Number only Line Number and TRC Line Number, TRC, and TCC Line Number, TRC, TCC, and GL?

 

I would expect the only outputs to be:

 1120 Pg. 1 Line No.1120 Pg. 1 Line DescTRC No.TRC DescTCC No.TCC DescGL AC No.GL AC Desc
Result 11120 Pg. 1 Line No.1120 Pg. 1 Line DescNullNullNullNullNullNull
Result 21121 Pg. 1 Line No.1121 Pg. 1 Line DescTRC No.TRC DescNullNullNullNull
Result 31122 Pg. 1 Line No.1122 Pg. 1 Line DescTRC No.TRC DescTCC No.TCC DescNullNull
Result 41123 Pg. 1 Line No.1123 Pg. 1 Line DescTRC No.TRC DescTCC No.TCC DescGL AC No.GL AC Desc

 

 

Questions:

  • How do I identify line numbers? If a line number always includes alphabetic characters, then I can distinguish it from TRC and TCC codes, but if GL accounts and line numbers are both alphanumeric and line numbers can have any number of characters, how do I distinguish between 6 character line number and a GL code?

Line Numbers are alpha numeric.  They are no more than 3 characters, which would be either just a one to two digit number possibly followed by a letter (subline) (just so I can use this on other forms, range from (1a-1z)-(99a-99z))  To get a better understanding you can look at the form in question f1120, which I have attached.  Basically our tax software maps GL ACs to a form line number using a TRC/TCC code combination.  If we didn't have any activity this year, but had activity in a prior year on a line there would be no GLs, but a TRC and TCC.  If we have never had any activity on a line there would be only a TRC (we create the TCC, the tax software creates the TRC).  If the line cannot be mapped to (lines on the tax form that total or subtotal) then there would not be a GL, a TRC, or a TCC.

 
 
  • If I have a new TCC number, I assume the line number and TRC carry down.

Correct.  See example below:

 

Excerpt of Alteryx Data at Present        
ID NoID Desc NOTE NOT IN DATA      
4 Additional section 263A costsLine Number      
40-140COGS: Additional Sec. 263A CostsTRC Number      
010Additional Section 263A CostsTCC Number      
550103Additional 263A Costs GL      
5 Other costs Line Number      
40-150COGS: Other Costs TRC Number      
020Advertising TCC Number      
557130Other Advertising GL      
025Allocation & Settlement - LedgeTCC Number      
558990Settlement Account GL      
559991CC Alloc In/Out GL      
559992Order Settlement GL      
559999FI-CO Recon. GL      
          
Alteryx Expected Result        
GL Acct No.GL Desc1120 Pg. 1 Ln No.1120 Pg. 1 Ln DescTRC/TCC ComboTRC No.TRC DescTCC No.TCC Desc 
550103Additional 263A Costs4 Additional section 263A costs40-140 01040-140COGS: Additional Sec. 263A Costs010Additional Section 263A Costs
557130Other Advertising5 Other costs40-150 02040-150COGS: Other Costs020Advertising 
558990Settlement Account5 Other costs40-150 02540-150COGS: Other Costs025Allocation & Settlement - Ledge
559991CC Alloc In/Out5 Other costs40-150 02540-150COGS: Other Costs025Allocation & Settlement - Ledge
559992Order Settlement5 Other costs40-150 02540-150COGS: Other Costs025Allocation & Settlement - Ledge
559999FI-CO Recon.5 Other costs40-150 02540-150COGS: Other Costs025Allocation & Settlement - Ledge
  • If I get a new TRC and it is not followed by a TCC, is the TCC now null or does the previous TCC carry down? If the TCC is null, then I assume the TRC must be followed by either a TCC or a line number.

If a Line Number is not followed by a TRC, TRC, TCC, GL would be Null

If a new TRC is not followed by a TCC, TCC & GL would be Null

If a TCC is not followed by GL, GL would be Null

IMPORTANT

If a GL is followed by TCC and NOT a TRC then the TRC should be the last TRC (see example above for TRC 40-150)

 

Let me know if you have additional questions.

 

 

 

TonyA
Alteryx Alumni (Retired)

Take a look at this. You'll need to check my logic carefully, I did this kind of quickly. Especially the last three tools where I decided when to keep rows that didn't have GL values.

I focused on getting everything into rows and getting rid of unneeded rows. So there is a little cleanup left to do including creating the concatenated column.

 

Also, since I did do this quickly you might be able to tighten it up.

 

Edit:

One more thing. I did not do any error checking. For example, I'm only checking to see if a change in line number is followed by a change in TRC and keeping the line if it isn't. I'm not checking to see if a change in line number is followed by, for example a new GL, which would be an error.

I just looked through your example data and that does happen in one place at the end of the data set, so you might want to account for that. It would just require some more complex logic in the last Multi-Row Formula tool.

 

Edit (2): I found an error that I just fixed. If you downloaded before 8am Pacific on May 1, please download again.

davidlocke
7 - Meteor

I compared the result against what we had for 2018 and it all tied out.  Thanks for your help.

Labels