Advent of Code is back! Unwrap daily challenges to sharpen your Alteryx skills and earn badges along the way! Learn more now.

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Change field name based on value in filename?

hellyars
13 - Pulsar

Okay, I have an annual file. 

 

The format of the filename is "fy##-name.abc"

 

Within each file year's are identified as Prior Year, Current Year, Year One, Year Two, Year Three, Year Four, Year Five.  

 

How can I build a formula that dynamically uses the two ## digits in the file name to convert Prior Year to ## - 2, Current Year to ## -1, Year One to ##, Year Two to ## + 1, etc.?   

18 REPLIES 18
tcroberts
12 - Quasar

I think your best bet here is to use information from the Directory Tool to grab the year.

 

I've attached a sample workflow that uses this method:

 

Spoiler
usefilenameasvalue.PNG

What I've done is use Directory Tool -> Dynamic input to read my data in (you'll likely have to make the directory tool look for more specific files, or add a filtering step before the Dynamic input/parse tool.

 

I then create a target column, with the name current year, so I can match to the transposed columns of the input file. Once I make this match, I use the multi-row formula to rename them. I've created a duplicate of the Name column here, but I don't think it's necessary in general, I just wanted to show that the numbers were generated correctly.

 

For this it is important that the "Values for Rows that don't Exist" is set to NULL, since I use a Null check to set Prior Year correctly.

 

After removing unnecessary information, I CrossTab back into the original format.

 

Let me know if you have any questions, or this workflow doesn't work for you for whatever reason.

 

Cheers!

hellyars
13 - Pulsar

I can kind of follow what you are trying to do.   But, I think there might be another challenge.  The PY, CY, Yr1... fields are not available at source input.  I have to parse to extract them.

 

Once parsed, the data looks like this.

 

 PYCYYr1Yr2Yr3Yr4
fy15_filename101520253035
fy16_filename101520253035
fy17_filename101520253035
fy18_filename101520253035

 

Individually, I simply use the select tool to rename PY, CY, Yr1... to the correct year.   But, I am trying to avoid building a long lengthy formula that has to be updated every year. 

 

 

 14151617181920212223

fy15_filename

10152025303540   
fy16_filename 10152025303540  
fy17_filename  10152025303540 
fy18_filename   10152025303540
tcroberts
12 - Quasar

I think you may still be able to do something like what I've done above, you'll just need to use the GroupBy feature and group by Filename.

 

Instead of using the find/replace Tool, I'm just joining in the results of my Regex before the Transpose so I can group by year, then my formula just creates that same column that the previous find replace did, replacing Current Year with the year, and setting all the others to the Name column. The multi-row tool is unchanged, and the CrossTab just needs to be set to GroupBy the RegExOut field as well.

 

usefilenameasvalue.PNG

 

I've attached an updated workflow, let me know if this helps.

 

hellyars
13 - Pulsar

Sorry, don't mean to be retarded.  The example has no input data.  So, I can't see what you are joining.  And, the real data has dozens of fields.  So, its throwing me a curve ball. 

 

 

tcroberts
12 - Quasar

I've just created some files with your naming scheme filled with 1s. as the data.

 

If you can add some mock data of what an input to this process would look like, I could put together a more realistic example workflow.

 

When you say it has dozens of fields, are you saying that there is like Year+12, or you're saying that there are multiple fields that need to be transformed like this?

 

i.e. you'd have:

 

 Prev YearCurrentYear `1Year 2Year 3Year 4Year 5
fy15_name_field149515253555862
fy15_name_field223252627283035
fy16_name_field151525657596368
fy16_name_field225282931353639
fy17_name_field152545558606162

 

If so, I think you'll want to find a way to get this field name into the workflow as a field, and group by it as I did with the dates. I can help you out if I can see a more realistic example of the data.

 

Hope this helps,

hellyars
13 - Pulsar

Okay, definitely not there ...but some small progress

 

I switched the filetype input from xlsx to my the xml.

The filenames are cy##_1234abc.zzz.  I want the ## after cy.  

I stripped everything going into the RegEx tool except filename.  

I then reset the RegEx tool to (d\{2}) from (d\{4}).  This grabs the 2 digits that follow cy (aka fy).  

 

 

I can't sort out what the Join tool is doing just yet.  It is current set on Join by Record position, which is only finding 4 joins and spitting out 29+ left.

tcroberts
12 - Quasar

I had the join tool on Record Position because there is one date I want to add per file. so the first file corresponds with the first date, the second with the second, etc.

 

The reason it's spitting out the 29+ is because there are multiple rows for each input, obviously something I should have accounted for in my workflow.

 

Are you able to get any sort of identifier for the file that a row comes from? This could be used in conjunction with the RecordID tool to join that identifier on RecordID instead of by position.

 

Is the number of rows in each file the same every time? If so, you could use the RecordID tool and a Formula tool to change it so that the first, lets say 10 rows are assigned to 1, the next 10 are assigned to 2, etc. Then, join on the RecordID from the RegEx.

 

The formula you'd want here is: CEIL([RecordID]/<numberofrowsperrecord>)

 

The attached workflow implements this, but again this will only work if the number of rows per file is constant across all files.

hellyars
13 - Pulsar

Let me try a different angle.

I can input all the files together in one step using *.zzz.    

File Name aside, there is a string field called Calendar Year.  It's just the 4 digit year.  

So, I can get all the files into the data stream in one fell swoop.  

 

This is still dummy data.  But this is essentially what the data looks like going into the two (2) Select tools before the Union.  (The workflow was originally developed to accept 1 year.  In that case, PY, CY, Yr1... are annually changed using the Select tools.)

 

 

 

File Name

Calendar Year (V_String)

Project

Prior Year

Current Year

Year 1

Year 2

Year 3

Year 4

Year 5

cy15_idnmbr.zzz

2015

Tacos

 $         10 

 $             12 

 $         14 

 $         16 

 $         18 

 $         20 

 $         22 

cy16_idnmbr.zzz

2016

Tacos

 $         12 

 $             14 

 $         15 

 $         17 

 $         19 

 $         21 

 $         23 

cy17_idnmbr.zzz

2017

Tacos

 $         13 

 $             12 

 $         15 

 $         18 

 $         19 

 $         22 

 $         23 

cy15_idnmbr.zzz

2018

Tacos

 $         17 

 $             15 

 $         19 

 $         22 

 $         24 

 $         23 

 $         26 

Screen Shot 2018-09-07 at 1.59.43 PM.png

 

 

tcroberts
12 - Quasar

It seems like calendar year isn't quite a unique identifier for the file, since the 2015 file has both 2018 and 2015.

 

If the calendar year 20## was always the same as the cy##, then, you could create the Calendar year from your RegEx output by adding the string "20" in front.

 

However, in this case I'm not sure you really need to do anything with the file name, so I must be missing something. You could simply do the transposing, renaming, and cross-tabbing using the Calendar Year field as the Grouping Field.

 

I'm sure I've missed something here so please let me know what it is,

 

Cheers!

 

EDIT: 

Labels