Alteryx Designer Desktop Discussions

Alteryx Designer Desktop Discussions

Change field name based on value in filename?

Re: Unable to get an output

Re: Extracting the list of sheet names across mult...

Re: Chaining Apps

Re: Unable to read in all raw xml from an excel fi...

Unable to read in all raw xml from an excel file

hellyars · ‎09-07-2018

Okay, I have an annual file.

The format of the filename is "fy##-name.abc"

Within each file year's are identified as Prior Year, Current Year, Year One, Year Two, Year Three, Year Four, Year Five.

How can I build a formula that dynamically uses the two ## digits in the file name to convert Prior Year to ## - 2, Current Year to ## -1, Year One to ##, Year Two to ## + 1, etc.?

tcroberts · ‎09-07-2018

I think your best bet here is to use information from the Directory Tool to grab the year.

I've attached a sample workflow that uses this method:

Spoiler

What I've done is use Directory Tool -> Dynamic input to read my data in (you'll likely have to make the directory tool look for more specific files, or add a filtering step before the Dynamic input/parse tool.

I then create a target column, with the name current year, so I can match to the transposed columns of the input file. Once I make this match, I use the multi-row formula to rename them. I've created a duplicate of the Name column here, but I don't think it's necessary in general, I just wanted to show that the numbers were generated correctly.

For this it is important that the "Values for Rows that don't Exist" is set to NULL, since I use a Null check to set Prior Year correctly.

After removing unnecessary information, I CrossTab back into the original format.

Let me know if you have any questions, or this workflow doesn't work for you for whatever reason.

Cheers!

hellyars · ‎09-07-2018

I can kind of follow what you are trying to do. But, I think there might be another challenge. The PY, CY, Yr1... fields are not available at source input. I have to parse to extract them.

Once parsed, the data looks like this.

Individually, I simply use the select tool to rename PY, CY, Yr1... to the correct year. But, I am trying to avoid building a long lengthy formula that has to be updated every year.

	14	15	16	17	18	19	20	21	22	23
fy15_filename	10	15	20	25	30	35	40
fy16_filename		10	15	20	25	30	35	40
fy17_filename			10	15	20	25	30	35	40
fy18_filename				10	15	20	25	30	35	40

tcroberts · ‎09-07-2018

I think you may still be able to do something like what I've done above, you'll just need to use the GroupBy feature and group by Filename.

Instead of using the find/replace Tool, I'm just joining in the results of my Regex before the Transpose so I can group by year, then my formula just creates that same column that the previous find replace did, replacing Current Year with the year, and setting all the others to the Name column. The multi-row tool is unchanged, and the CrossTab just needs to be set to GroupBy the RegExOut field as well.

I've attached an updated workflow, let me know if this helps.

hellyars · ‎09-07-2018

Sorry, don't mean to be retarded. The example has no input data. So, I can't see what you are joining. And, the real data has dozens of fields. So, its throwing me a curve ball.

tcroberts · ‎09-07-2018

I've just created some files with your naming scheme filled with 1s. as the data.

If you can add some mock data of what an input to this process would look like, I could put together a more realistic example workflow.

When you say it has dozens of fields, are you saying that there is like Year+12, or you're saying that there are multiple fields that need to be transformed like this?

i.e. you'd have:

	Prev Year	Current	Year `1	Year 2	Year 3	Year 4	Year 5
fy15_name_field1	49	51	52	53	55	58	62
fy15_name_field2	23	25	26	27	28	30	35
fy16_name_field1	51	52	56	57	59	63	68
fy16_name_field2	25	28	29	31	35	36	39
fy17_name_field1	52	54	55	58	60	61	62

If so, I think you'll want to find a way to get this field name into the workflow as a field, and group by it as I did with the dates. I can help you out if I can see a more realistic example of the data.

Hope this helps,

hellyars · ‎09-07-2018

Okay, definitely not there ...but some small progress

I switched the filetype input from xlsx to my the xml.

The filenames are cy##_1234abc.zzz. I want the ## after cy.

I stripped everything going into the RegEx tool except filename.

I then reset the RegEx tool to (d\{2}) from (d\{4}). This grabs the 2 digits that follow cy (aka fy).

I can't sort out what the Join tool is doing just yet. It is current set on Join by Record position, which is only finding 4 joins and spitting out 29+ left.

tcroberts · ‎09-07-2018

I had the join tool on Record Position because there is one date I want to add per file. so the first file corresponds with the first date, the second with the second, etc.

The reason it's spitting out the 29+ is because there are multiple rows for each input, obviously something I should have accounted for in my workflow.

Are you able to get any sort of identifier for the file that a row comes from? This could be used in conjunction with the RecordID tool to join that identifier on RecordID instead of by position.

Is the number of rows in each file the same every time? If so, you could use the RecordID tool and a Formula tool to change it so that the first, lets say 10 rows are assigned to 1, the next 10 are assigned to 2, etc. Then, join on the RecordID from the RegEx.

The formula you'd want here is: CEIL([RecordID]/<numberofrowsperrecord>)

The attached workflow implements this, but again this will only work if the number of rows per file is constant across all files.

hellyars · ‎09-07-2018

Let me try a different angle.

I can input all the files together in one step using *.zzz.

File Name aside, there is a string field called Calendar Year. It's just the 4 digit year.

So, I can get all the files into the data stream in one fell swoop.

This is still dummy data. But this is essentially what the data looks like going into the two (2) Select tools before the Union. (The workflow was originally developed to accept 1 year. In that case, PY, CY, Yr1... are annually changed using the Select tools.)

File Name	Calendar Year (V_String)	Project	Prior Year	Current Year	Year 1	Year 2	Year 3	Year 4	Year 5
cy15_idnmbr.zzz	2015	Tacos	$ 10	$ 12	$ 14	$ 16	$ 18	$ 20	$ 22
cy16_idnmbr.zzz	2016	Tacos	$ 12	$ 14	$ 15	$ 17	$ 19	$ 21	$ 23
cy17_idnmbr.zzz	2017	Tacos	$ 13	$ 12	$ 15	$ 18	$ 19	$ 22	$ 23
cy15_idnmbr.zzz	2018	Tacos	$ 17	$ 15	$ 19	$ 22	$ 24	$ 23	$ 26

tcroberts · ‎09-07-2018

It seems like calendar year isn't quite a unique identifier for the file, since the 2015 file has both 2018 and 2015.

If the calendar year 20## was always the same as the cy##, then, you could create the Calendar year from your RegEx output by adding the string "20" in front.

However, in this case I'm not sure you really need to do anything with the file name, so I must be missing something. You could simply do the transposing, renaming, and cross-tabbing using the Calendar Year field as the Grouping Field.

I'm sure I've missed something here so please let me know what it is,

Cheers!

EDIT:

fy15_filename

fy16_filename

fy17_filename

fy18_filename