Alteryx Designer Desktop Discussions

Anitta · ‎12-03-2016

Hi community

This might be a simple question:

If a line contains the same pattern more than once, how can I parse all occurrences into multiple rows?

If the pattern is (\d.{0,4}.?[Kk][Gg]), it finds and parses 15,5 kg once. But because I work with unstructured data, the lines look different from record to record, e.g. record 34: live_weight 2500kg, loin 2,8 Kg, rear 18kg, vegi_25lbs etc.

I would like to make sure that all entities containing [kg] are parsed out into separate columns in the same process, so:

RECORD 34: "live_weight 2500kg, loin 2,8 Kg, rear 18kg, vegi_25lbs etc" becomes --> col1: 2500kg; col2: 2,8 Kg; col3: 18kg. In addition, I would also like to get a consolidated count: parsed = 3 out of 4.
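
For anyone who wants to prototype the desired result for record 34 outside Alteryx, here is a minimal Python sketch. It is not the workflow discussed in this thread. The tighter kg pattern, the "any quantity" pattern used for the "out of 4" total, and the function name parse_kg are all assumptions made for illustration.

```python
import re

# Assumed pattern: digits, optional decimal part with comma or dot,
# optional single space, then "kg" in any case.
KG = re.compile(r"\d+(?:[.,]\d+)?\s?kg", re.IGNORECASE)

# Assumed definition of "an entity": any number directly followed by a unit word.
ANY_QTY = re.compile(r"\d+(?:[.,]\d+)?\s?[A-Za-z]+")


def parse_kg(line):
    """Return every kg match as col1, col2, ... plus a consolidated count."""
    matches = KG.findall(line)            # all occurrences, not just the leftmost
    total = len(ANY_QTY.findall(line))    # all number+unit entities on the line
    row = {f"col{i}": m for i, m in enumerate(matches, start=1)}
    row["parsed"] = f"{len(matches)} out of {total}"
    return row


record = "live_weight 2500kg, loin 2,8 Kg, rear 18kg, vegi_25lbs etc"
print(parse_kg(record))
```

The key point is that findall returns every non-overlapping match on the line, which is the behaviour asked for here, rather than stopping at the first one.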

2) I have found that several variants of the same product are often specified on the same line. E.g. in this example there are in fact three different motors:

RECORD: MOTOR 2500RPM, 2,8KW, 18NM, 7,5A,346V 1750RPM, 3,3KW, 18NM, 7,5A,398V 2000RPM, 3,7KW, 18NM, 7,6A,447V

How can I write a split regex that recognizes the repeated string pattern, so that I get:

MOTOR 2500RPM, 2,8KW, 18NM, 7,5A,346V

MOTOR 1750RPM, 3,3KW, 18NM, 7,5A,398V

MOTOR 2000RPM, 3,7KW, 18NM, 7,6A,447V
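
The split shown above can be sketched in Python as follows. This is only an illustration under two assumptions: the first word of the record is the product label, and every variant starts with a number followed by RPM. The function name split_variants is hypothetical.

```python
import re


def split_variants(record):
    """Split one record into one line per variant, repeating the product label."""
    # Assumption: the first word ("MOTOR") is the label shared by all variants.
    label, rest = record.split(" ", 1)
    # Split on whitespace that is immediately followed by "<digits>RPM",
    # i.e. at the start of each new variant. The lookahead keeps the RPM token.
    variants = re.split(r"\s+(?=\d+RPM)", rest)
    return [f"{label} {v}" for v in variants]


record = ("MOTOR 2500RPM, 2,8KW, 18NM, 7,5A,346V "
          "1750RPM, 3,3KW, 18NM, 7,5A,398V "
          "2000RPM, 3,7KW, 18NM, 7,6A,447V")

for line in split_variants(record):
    print(line)
```

Records whose variants do not begin with an RPM value would need a different anchor token in the lookahead.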

Because I look through a lot of lines, solving this will be a tremendous help.

br anitta

JohnJPS · ‎12-03-2016

Hi @Anitta,

I'm not sure, but I'm guessing you're using the Formula tool and parsing with the functions there? Consider using the Parse RegEx tool (https://help.alteryx.com/10.5/RegEx.htm), which accepts a RegEx and has the option to split results to rows.

Joe_Mako · ‎12-03-2016

I attempted to parse and attached is what I came up with as a first draft. If it does not work for other data records, please add some sample data and the workflow can be adjusted. Please let me know, thanks!

Anitta · ‎12-04-2016

Hi John

You are right, I do use the parsing tool, and I am stumbling into issues using regex. Because the lines contain the information I want to parse more than once, the tool limits me: I cannot be sure that I have all the data for the same pattern, because it only retrieves the leftmost match.

I am uploading a sample of the data here. The lines actually contain data for three different items, and the parsing tool only tells me about one of them.

br anitta

Anitta · ‎12-04-2016

Hi Joe

I am amazed. Thank you for your solutions. I am just now trying your solution on a big data sample to find out whether I can connect your solutions in the same workflow or whether I have to do it in separate steps. But your solutions actually do seem to perform :-)

Since the data is very unstructured, I am not altogether sure how the data behaves across databases, but this is one of the biggest issues so far.

Thanks anitta

Anitta · ‎12-06-2016

Hi

I have been working on your suggested solution because I need two more steps in order to analyse the data:

1) I need to shuffle all pattern items to columns, and

2) I need to have the values for each header text concatenated (the value if available, otherwise null).

Basically, for each unique row I would like to have the patterns shuffled to columns, and for each pattern to have the value attached in a second row. In this process I need to scrutinize the relation between "matching patterns" to see whether they are the same or not (partly manually), but ultimately I would like to have the columns for matching patterns concatenated and the values (null or actual value) listed underneath in a separate row. From where I am now I cannot use the Cross Tab tool because I have only the name text and no values, so I was thinking of parsing out the values in a separate step and afterwards concatenating based on record ID.

But do you see another way around this? I have attached the additional step so far in the workflow.

br anitta
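
As a rough illustration of the layout described in this last post (header texts as columns, with the value or null underneath for each record), here is a small Python sketch. It assumes the values have already been parsed into (record ID, header, value) rows in a separate step. The function name cross_tab and the sample rows, including record 35, are made up for the example.

```python
from collections import defaultdict


def cross_tab(rows):
    """Pivot (record_id, header, value) rows into one row of values per record."""
    headers = []                      # column order = first appearance
    by_record = defaultdict(dict)
    for record_id, header, value in rows:
        if header not in headers:
            headers.append(header)
        by_record[record_id][header] = value
    # Missing header/value combinations become None (null).
    table = {rid: [vals.get(h) for h in headers]
             for rid, vals in by_record.items()}
    return headers, table


# Hypothetical parsed rows; record 35 is invented to show a null.
rows = [
    (34, "live_weight", "2500kg"),
    (34, "loin", "2,8 Kg"),
    (35, "loin", "3,1 kg"),
]

headers, table = cross_tab(rows)
print(headers)
for record_id, values in table.items():
    print(record_id, values)
```

Merging headers that turn out to be the same pattern under different spellings would still be a manual mapping step applied to the header names before the pivot.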

how to parse the same information from the same line