I need to regularly process a large flat file (csv) containing a few thousand rows. The data is messy, but generally organized by [unit], [group], and [activity] where each file has up to 10 units, each unit has upwards of 300 groups and each group has 8 or so activities. The number of records (rows) for each activity is highly variable and not known in advance. I am interested in all the rows of any particular [unit] AND [group] AND [activity] WHEN [activity]=”act_nn_end” AND [result1]=”val1” AND [result2]=”val2” (for example the act_2_* rows in the table below)
This could probably done in a database by a SELECT for the condition and retaining the [unit], [group], and [activity] (wildcard the act_nn*) in a result table that I could then JOIN with the input to get the rows I need. But I am wondering if this could also be done in Alteryx. However I cannot see how to do this and thought I would ask the community for some advice or suggestions here.
I have an example of the file below. Since I will get 4-5 flat files a week it would certainly be helpful to automate selecting the portions of interest out of each file without dropping them into a db every time. And there will be multiple combinations of [result1] and [result2] that will be interesting – certain combinations will indicate a particular action to take for example.
Thanks for looking at this. Any hints or suggestions would be appreciated,
Tim
timestamp | unit | group_id | Activity | result1 | result2 | other fields …. (60) |
hh:mm:00 | u1 | gid1 | act_1_start | [Null] | [Null] | [Null] |
hh:mm:10 | u1 | gid1 | act_1 | [Null] | [Null] | [Null] or data |
hh:mm:20 | u1 | gid1 | act_1 | [Null] | [Null] | [Null] or data |
hh:mm:30 | u1 | gid1 | act_1 | [Null] | [Null] | [Null] or data |
hh:mm:40 | u1 | gid1 | act_1 | [Null] | [Null] | [Null] or data |
hh:mm:50 | u1 | gid1 | act_1 | [Null] | [Null] | [Null] or data |
hh:mm:52 | u1 | gid1 | act_1_end | val1 | val2 | [Null] or data |
hh:mm:60 | u1 | gid1 | [Null] | [Null] | [Null] | … |
hh:mm:ss | u1 | gid1 | .. | … | … | … |
hh:m2:00 | u1 | gid1 | act_2_start | [Null] | [Null] | [Null] |
hh:m2:10 | u1 | gid1 | act_2 | [Null] | [Null] | [Null] or data |
hh:m2:20 | u1 | gid1 | act_2 | [Null] | [Null] | [Null] or data |
hh:m2:38 | u1 | gid1 | act_2_end | val1 | val4 | [Null] or data |
hh:m2:40 | u1 | gid1 | … | [Null] | [Null] | … |
hh:mm:ss | u1 | … | … | … | … | … |
hh:m3:00 | u1 | gid2 | act_1_start | [Null] | [Null] | [Null] |