
Challenge #36: Data Cleansing Extract Authors

GeneR
Alteryx Alumni (Retired)

The solution to the previous challenge (#35) is linked HERE

 

Use Case:  An analytical consulting company downloads medical journal publication data from the web and would like to extract all of the authors for the listed entries.

 

The text input contains details about each article, where FAU indicates an author name for the article; in most cases there are multiple authors. The details of each article are contained in lines that begin with PMID and end with an empty line.

 

Objective: Parse out each article PMID and list each author in sequential columns as seen in the Results.yxmd file.
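
For anyone who wants to check their workflow output outside Designer, here is a minimal Python sketch of the objective above. It assumes each line looks like `TAG - value` (for example `PMID- 111` or `FAU - Doe, Jane`), which the challenge text does not spell out. The function name `extract_authors`, the `Author1`, `Author2`, ... column names, and the sample records are all hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def extract_authors(text: str) -> List[Dict[str, str]]:
    """Return one row per article: its PMID plus Author1, Author2, ... columns."""
    articles: List[Tuple[str, List[str]]] = []
    pmid: Optional[str] = None
    authors: List[str] = []

    # The extra "" guarantees the last article is flushed even without a trailing blank line.
    for line in text.splitlines() + [""]:
        if not line.strip():
            # An empty line ends the current article.
            if pmid is not None:
                articles.append((pmid, authors))
            pmid, authors = None, []
            continue

        # Assumed layout: tag, a hyphen, then the value. Split on the FIRST hyphen only,
        # so hyphenated surnames stay intact.
        tag, sep, value = line.partition("-")
        if not sep:
            continue
        tag, value = tag.strip(), value.strip()
        if tag == "PMID":
            pmid = value
        elif tag == "FAU":
            authors.append(value)

    # Pad every row to the widest author list so all rows share the same columns.
    width = max((len(names) for _, names in articles), default=0)
    rows: List[Dict[str, str]] = []
    for article_pmid, names in articles:
        row = {"PMID": article_pmid}
        for i in range(width):
            row[f"Author{i + 1}"] = names[i] if i < len(names) else ""
        rows.append(row)
    return rows


# Hypothetical sample input, not taken from the challenge data.
SAMPLE = """PMID- 111
TI  - First hypothetical title
FAU - Doe, Jane
FAU - Smith-Jones, Anna

PMID- 222
FAU - Roe, Richard
"""

for row in extract_authors(SAMPLE):
    print(row)
```

Lines with any other tag are simply ignored, so the sketch does not depend on knowing the full set of fields in the download.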

TaraM
Alteryx Alumni (Retired)

A solution has been included 

Spoiler
2016-08-15 08_47_29-Alteryx Designer x64 BETA - DataPrep_ExtractAuthors_Intermediate_Solution.yxmd_.png
Tara McCoy
Naledi
7 - Meteor

 

Spoiler

ALteryx weekly exercise 36.PNG

My approach

 

GeneR
Alteryx Alumni (Retired)

Very nice use of the Select Records tool, @Naledi

MarqueeCrew
20 - Arcturus

@JoeM,

 

Another attempt to earn my challenge badge(s).

 

Cheers,

Mark

 

P.S.  I'm glad to see @TaraM is so active in the challenges :)

Alteryx ACE & Top Community Contributor

Chaos reigns within. Repent, reflect and restart. Order shall return.
Please subscribe to my YouTube channel.
SeanAdams
17 - Castor

Slightly different approach

 

Spoiler
Split the data into two streams: article headers and authors.
Then processed the authors into a cross-tab in two ways (for practice):
- First, a simple ArticleAuthorID (using a Multi-Row Formula) followed by a Cross Tab
- Second, a Summarize to concatenate the authors into one delimited field, then Text To Columns to get the same result as the Cross Tab
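
The two routes described above can be compared side by side in a rough Python sketch. This is not the workflow itself, only the same logic under assumptions: the input is already a list of (PMID, author) pairs in file order, and the function names, the `|` delimiter, and the sample pairs are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Pairs = List[Tuple[str, str]]


def pivot_by_sequence(pairs: Pairs) -> Dict[str, List[str]]:
    """Route 1: number the authors within each article, then pivot on that number."""
    seq: Dict[str, int] = defaultdict(int)
    table: Dict[str, Dict[int, str]] = {}
    for pmid, author in pairs:
        seq[pmid] += 1  # plays the role of the per-article author ID
        table.setdefault(pmid, {})[seq[pmid]] = author
    width = max((len(cols) for cols in table.values()), default=0)
    return {
        pmid: [cols.get(i, "") for i in range(1, width + 1)]
        for pmid, cols in table.items()
    }


def pivot_by_concat(pairs: Pairs, delimiter: str = "|") -> Dict[str, List[str]]:
    """Route 2: concatenate authors per article, then split the string into columns."""
    joined: Dict[str, str] = {}
    for pmid, author in pairs:
        joined[pmid] = f"{joined[pmid]}{delimiter}{author}" if pmid in joined else author
    split = {pmid: text.split(delimiter) for pmid, text in joined.items()}
    width = max((len(parts) for parts in split.values()), default=0)
    return {pmid: parts + [""] * (width - len(parts)) for pmid, parts in split.items()}


# Hypothetical sample pairs.
pairs = [("111", "Doe, Jane"), ("111", "Smith-Jones, Anna"), ("222", "Roe, Richard")]
print(pivot_by_sequence(pairs))
print(pivot_by_concat(pairs))
```

One thing to watch with the second route: author names of the form "Last, First" contain commas, so the delimiter has to be a character that never appears in a name. The sketch uses `|` for that reason.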


NicoleJohnson
ACE Emeritus

Solved!

 

Spoiler
WeeklyChallenge36.JPG
estherb47
15 - Aurora

I used the Summarize tool to concatenate all of the names per PMID, and then parsed them back into columns with Text To Columns.

Spoiler
image.png
LordNeilLord
15 - Aurora

This was fun. It would have been much easier if you could group when creating a RecordID...

Spoiler
As an easy alternative, I gave every record a count of 1 and then used a Running Total :)

Weekly Challenge 36.png
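
The count-of-1 trick can be illustrated in a few lines of Python. This is only a sketch of the idea, assuming the rows are (PMID, author) pairs with each article's rows contiguous, as they are in the source file; `grouped_record_ids` and the sample rows are hypothetical.

```python
from itertools import accumulate, groupby
from operator import itemgetter
from typing import List, Tuple


def grouped_record_ids(pairs: List[Tuple[str, str]]) -> List[Tuple[str, str, int]]:
    """Attach a per-article record ID by running-totalling a constant 1 within each PMID."""
    out: List[Tuple[str, str, int]] = []
    # groupby only merges ADJACENT rows with the same key, hence the contiguity assumption.
    for pmid, group in groupby(pairs, key=itemgetter(0)):
        members = list(group)
        # Every row contributes 1; the running total restarts for each group.
        totals = accumulate(1 for _ in members)
        out.extend((pmid, author, rid) for (_, author), rid in zip(members, totals))
    return out


# Hypothetical sample rows.
rows = [("111", "Doe, Jane"), ("111", "Smith-Jones, Anna"), ("222", "Roe, Richard")]
print(grouped_record_ids(rows))
```

The resulting number restarts at 1 for every article, which is what a grouped RecordID would have produced, and it can serve as the column header when pivoting.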
nick_ceneviva
11 - Bolide

Solution attached