Do you have the skills to make it to the top? Subscribe to our weekly challenges. Try your best to solve the problem, share your solution, and see how others tackled the same problem. We share our answer too.
Weekly Challenge

Challenge #36: Data Cleansing Extract Authors

Alteryx Alumni (Retired)

The link to the solution for the last challenge (#35) is HERE

 

Use Case:  An analytical consulting company downloads medical journal publication data from the web and would like to extract all of the authors for the listed entries.

 

The text input contains details about each article, where FAU indicates an author name for the article; in most cases there are multiple authors. Each article's details form a block of lines that begins with a PMID line and ends with an empty line.

 

Objective: Parse out each article PMID and list each author in sequential columns as seen in the Results.yxmd file.
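
For readers who want to see the parsing logic outside Designer, here is a minimal Python sketch. It is not the posted Alteryx solution. It assumes a MEDLINE-style `TAG - value` layout (for example `PMID- 100001` and `FAU - Doe, Jane`); the sample records and the function names `parse_articles` and `to_wide` are made up for illustration.

```python
# Hypothetical sample; the tag layout ("PMID- ", "FAU - ") is an assumption.
SAMPLE = """PMID- 100001
TI  - A made-up title
FAU - Doe, Jane
FAU - Roe, Richard

PMID- 100002
FAU - Smith-Jones, Alex
"""

def parse_articles(text):
    """Return a list of (pmid, [authors]) pairs, one per article block."""
    articles = []
    pmid, authors = None, []
    for line in text.splitlines():
        if not line.strip():
            # An empty line closes the current article block.
            if pmid is not None:
                articles.append((pmid, authors))
            pmid, authors = None, []
            continue
        # Split on the first hyphen only, so hyphenated names survive.
        tag, sep, value = line.partition("-")
        tag, value = tag.strip(), value.strip()
        if sep and tag == "PMID":
            pmid = value
        elif sep and tag == "FAU":
            authors.append(value)
    if pmid is not None:
        # The last block may not be followed by an empty line.
        articles.append((pmid, authors))
    return articles

def to_wide(articles):
    """One row per PMID, authors in sequential columns, padded with ''."""
    width = max((len(authors) for _, authors in articles), default=0)
    return [[pmid] + authors + [""] * (width - len(authors))
            for pmid, authors in articles]
```

Calling `to_wide(parse_articles(SAMPLE))` gives one row per article with the PMID first and the authors in the following columns.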

Creative Director

A solution has been included 

Spoiler
2016-08-15 08_47_29-Alteryx Designer x64 BETA - DataPrep_ExtractAuthors_Intermediate_Solution.yxmd_.png
Tara McCoy
Alteryx Certified Partner

 

Spoiler

ALteryx weekly exercise 36.PNG

My approach

 

Alteryx Alumni (Retired)

Very nice use of the 'select records' tool @Naledi

Alteryx Certified Partner

@JoeM,

 

Another attempt to earn my challenge badge(s).

 

Cheers,

Mark

 

P.S.  I'm glad to see @TaraM is so active in the challenges :)

Alteryx ACE & Top Community Contributor

Chaos reigns within. Repent, reflect and reboot. Order shall return.
Nebula

Slightly different approach

 

Spoiler
Split the data into 2 streams: article headers and authors.
Then processed the authors into a cross-tab in 2 ways (for practice):
- The first was a simple ArticleAuthorID (using a Multi-Row Formula) followed by a Cross Tab.
- The second was to use a Summarize to concatenate the authors into one delimited field, then Text To Columns to get the same result as the Cross Tab.
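
As a rough Python analogue of the first way (a per-article author ID, then a cross-tab), the sketch below assumes the author stream has already been reduced to `(pmid, author)` pairs in their original order. The names `number_authors` and `crosstab` and the `Author1`, `Author2` column labels are illustrative, not taken from the workflow.

```python
def number_authors(rows):
    """Add a 1-based author position that restarts whenever the PMID changes.

    This mirrors a row-by-row comparison with the previous record.
    """
    numbered = []
    prev_pmid, position = None, 0
    for pmid, author in rows:
        position = position + 1 if pmid == prev_pmid else 1
        numbered.append((pmid, position, author))
        prev_pmid = pmid
    return numbered

def crosstab(numbered):
    """Pivot (pmid, position, author) rows into one dict of columns per PMID."""
    table = {}
    for pmid, position, author in numbered:
        table.setdefault(pmid, {})[f"Author{position}"] = author
    return table

# Hypothetical input rows.
rows = [("100001", "Doe, Jane"), ("100001", "Roe, Richard"), ("100002", "Smith-Jones, Alex")]
wide = crosstab(number_authors(rows))
```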


Magnetar

Solved!

 

Spoiler
WeeklyChallenge36.JPG
Pulsar

I used the Summarize tool to concatenate all of the names per PMID, and then parsed them back into columns with Text To Columns.

Spoiler
image.png
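
A hedged sketch of the concatenate-then-split idea in Python, assuming `(pmid, author)` pairs as input. The function names are hypothetical, and the `|` delimiter is my choice for the example: author names in this kind of data often contain commas, so a comma delimiter would split a single name across columns.

```python
def concat_authors(rows, delimiter="|"):
    """Join all author names for each PMID into one delimited string."""
    combined = {}
    for pmid, author in rows:
        if pmid in combined:
            combined[pmid] += delimiter + author
        else:
            combined[pmid] = author
    return combined

def split_to_columns(combined, delimiter="|"):
    """Split each delimited string back out, padding short rows with ''."""
    split = {pmid: joined.split(delimiter) for pmid, joined in combined.items()}
    width = max((len(parts) for parts in split.values()), default=0)
    return [[pmid] + parts + [""] * (width - len(parts))
            for pmid, parts in split.items()]

# Hypothetical input rows.
author_rows = [("100001", "Doe, Jane"), ("100001", "Roe, Richard"), ("100002", "Smith-Jones, Alex")]
columns = split_to_columns(concat_authors(author_rows))
```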
Alteryx Certified Partner

This was fun. It would have been much easier if you could group when creating a RecordID...

Spoiler
But as an easy alternative I gave every record a count of 1 and then used running total :)
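
The same trick can be sketched in Python: a running total over a constant 1, restarted for each article, acts as a grouped record ID. This is my reading of the approach, assuming the total is grouped by PMID and that rows for one article are adjacent; `grouped_record_ids` is a hypothetical name.

```python
from itertools import accumulate, groupby

def grouped_record_ids(rows):
    """Attach a per-PMID running total of 1s to each (pmid, author) row."""
    out = []
    # groupby only merges adjacent rows, so rows must stay in article order.
    for pmid, group in groupby(rows, key=lambda row: row[0]):
        group = list(group)
        totals = accumulate(1 for _ in group)  # 1, 2, 3, ... within the group
        out.extend((pmid, author, total)
                   for (_, author), total in zip(group, totals))
    return out

# Hypothetical input rows.
sample_rows = [("100001", "Doe, Jane"), ("100001", "Roe, Richard"), ("100002", "Smith-Jones, Alex")]
with_ids = grouped_record_ids(sample_rows)
```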

Weekly Challenge 36.png
Alteryx Certified Partner

Solution attached