Alteryx Designer Discussions

Find answers, ask questions, and share expertise about Alteryx Designer.
SOLVED

Extracting data from the middle of a txt file

mwill683
5 - Atom

Hello team,

 

Im a bit stuck on a task that im attempting to solve through Alteryx.

 

My tasks;
1. Count the number of IDs under the "DOEtoken" feature -any user ID with more than 5 tokens gets an email

2. Count the number of days the user has the license under the "DOEtoken" feature -any user with more than 8 days gets an email

3. Identify IDs under the "ccmpower" features with hostnames of WGC* and hpclog*

 

I am able to get the information for the "DOEtoken" feature in my workflow.

 

The "ccmpower" feature is more difficult since it can start anywhere in the middle of the file. The layout of the file is always the same.

 

I have included a sample layout of the file and my workflow.

 

Any assistance is appreciated.

4 REPLIES 4
danilang
18 - Pollux
18 - Pollux

Hi @mwill683 

 

Use a MultiRow tool to mark the various groups with a GroupName column with a formula like 

 

If startwith(trim([data]), "Users of ") then
   Substring([data],findstring([data],"Users of ")+8))
Else
    [Row-1:GroupName]
Endif

Then you add a filter to extract the group you want

 

Dan

 

mwill683
5 - Atom

Hello @danilang,

 

My apologies, I am still new to Alteryx.

 

Should I add a second multi-row tool after the record id tool?

 

also what needs to be the 'data' variable?

 

Thanks again

danilang
18 - Pollux
18 - Pollux

Hi @mwill683 

 

When you have a report style input with differing formats on various rows, it always a good idea to load the entire file and then deal with the various sections within the file.  There's always a header row that needs to be copied down and applied to various data rows.  In your case, the header row is "Users of xxxxxx" row. xxxxxx is that value that needs to be applied to each of the following data rows to be able to group them further in the workflow.   Reading the file in as csv as splitting on input, usually ends up breaking up important rows.  Starting the input at a specific row, can lead to important data loss, i.e. the initial "Users of" row as well as inadvertently losing data.   By starting your import on row 23, you ended up with "BJEON2" as a column header, when it should have been a token value 

 

danilang_0-1615639412846.png

 

Attached is an example that deals with your data.  Read in the data as flat file to preserve the leading spaces.  Get the group names for the "Users in" rows and copy them down until the group ends.  After removing the header rows, split the remaining rows to columns.  At this point the data is similar to your original where you split Login_date to month and day, the main deference is that you all the token values and they're grouped.  The summarize at the top groups by GroupName and token to give you results like this

 

danilang_1-1615639974570.png

 

The bottom branch gives you the groups that don't have any users.  

 

Dan

     

mwill683
5 - Atom

Hello @danilang,

 

This works great for me. Thanks

 

I'm a bit lost on the construction but I will treat this as a learning opportunity as I breakdown your workflow.

 

Thanks again.

Labels