Weekly Challenges


Challenge #116: A Symphony of Parsing Tools!

ChristineB
Alteryx Alumni (Retired)

A solution for last week's challenge has been posted here!  

 

The NY Philharmonic, as various ensembles, has been performing for audiences around the world for over 175 years! Wow! This week's Challenge asks you to parse the data for each of their programs from 2011 to 2017 (not for the past 175 years... that file was HUGE!). For each program, identify the concert information (Date, Location, Time, etc.), as well as the pieces played during that program and the solo performers (if applicable). Note: the posted solution excludes records that represent an intermission.
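For anyone who wants to sanity-check their workflow output outside Alteryx, the flattening described above can be sketched in Python. This is only a rough illustration, not the posted solution: the sample XML and its element names (`program`, `programID`, `concertInfo`, `Date`, `Location`, `Time`, `work`, `workTitle`, `soloistName`) are assumptions about the file layout and may need adjusting to match the real start file.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; element names are assumed, not copied from the real file.
SAMPLE = """
<programs>
  <program>
    <programID>1</programID>
    <concertInfo>
      <Date>2011-09-07</Date>
      <Location>Sample City</Location>
      <Time>7:30PM</Time>
    </concertInfo>
    <worksInfo>
      <work>
        <workTitle>Piece A</workTitle>
        <soloists>
          <soloist><soloistName>Soloist X</soloistName></soloist>
        </soloists>
      </work>
      <work>
        <workTitle>Piece B</workTitle>
      </work>
    </worksInfo>
  </program>
</programs>
"""


def flatten_programs(xml_text):
    """Return one row per (program, work) with concert details attached."""
    rows = []
    root = ET.fromstring(xml_text)
    for program in root.iter("program"):
        # This sample has a single concertInfo block per program.
        concert = program.find("concertInfo")
        for work in program.iter("work"):
            # Collect every soloist name under this work (may be empty).
            soloists = [s.text for s in work.iter("soloistName")]
            rows.append({
                "program_id": program.findtext("programID"),
                "date": concert.findtext("Date") if concert is not None else None,
                "location": concert.findtext("Location") if concert is not None else None,
                "time": concert.findtext("Time") if concert is not None else None,
                "title": work.findtext("workTitle"),
                "soloists": "; ".join(soloists),
            })
    return rows


rows = flatten_programs(SAMPLE)
for row in rows:
    print(row)
```

Works without soloists simply get an empty `soloists` value, which matches the "if applicable" wording in the challenge.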


mmongeon
8 - Asteroid

I think there is an issue with the output provided: it doesn't match what I see in the XML.

See the Performance Date in the screenshot below: 

 

issue with output.JPG

ChristineB
Alteryx Alumni (Retired)

Hmmm...let me investigate!  Thanks for letting me know @mmongeon

ChristineB
Alteryx Alumni (Retired)

Good catch @mmongeon!  I did something silly with my DateTime tool.  The start file has been replaced with a corrected version (correct at least until someone catches something else!).

joe_yang
6 - Meteoroid

Hi,

 

This could just be my lack of understanding, but I am wondering why Program ID 11640 was excluded from the Output even though its Performance Date is 2011-09-07. See image below:

 

Capture.PNG

mmongeon
8 - Asteroid

Here is my solution.

I'm actually getting more data than was provided in the given Output, but, from the spot checks I've performed, I think my additional records are valid.

 

 

Spoiler
There may be easier ways to get the data out of the child levels... but it works.

workflow 116.JPG

 

ChristineB
Alteryx Alumni (Retired)

Another good catch @joe_yang!  I've updated the start file again.  Also, I think I found a point of discrepancy: I opted to remove the records containing "Intermission" in the column "interval".  That means I removed records that @mmongeon's solution includes (and that are present in the original XML file).  I'm loving all the intense data investigation this Challenge requires (especially on under-caffeinated Mondays...)!
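To make the discrepancy concrete, here is a small Python sketch of the intermission filter. The rows are made up for illustration; only the column name "interval" and the value "Intermission" come from the description above, and the helper name `drop_intermissions` is hypothetical.

```python
# Hypothetical rows; the "interval" key mirrors the column named in the post.
rows = [
    {"program_id": "1", "title": "Piece A", "interval": None},
    {"program_id": "1", "title": None, "interval": "Intermission"},
    {"program_id": "1", "title": "Piece B", "interval": None},
]


def drop_intermissions(rows):
    """Keep rows whose 'interval' value does not contain 'Intermission'."""
    return [
        r for r in rows
        if "intermission" not in (r.get("interval") or "").lower()
    ]


kept = drop_intermissions(rows)
print(len(rows), "rows before,", len(kept), "rows after")
```

A solution that skips this step keeps the extra intermission rows, which would explain a higher record count than the posted output.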

ivoller
12 - Quasar

I didn't get exactly the same output. Saw that there were some changes and that Intermissions were to be excluded. Couldn't face going back again. My solution is messy and I should have summarized earlier in the process.

 

Spoiler
2018-05-14_17-41-16.png

@mmongeon's solution is prettier and almost certainly performs better.

terry10
11 - Bolide

My XML parsing revealed that each program could have more than one concert, so my final counts were different from the challenge output.

(e.g. Program 11633 has 3 concert dates and times, and the 3 pieces were played at each concert, so I ended up with 9 records.)

 

3 concert dates for Program 11633

 

 

My output for program 11633

Solution Output for program 11633
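The record count for a program with several concerts can be reproduced with a quick Python sketch. The concert and piece labels below are placeholders, not values from the file; the point is only that pairing every concert with every piece gives 3 x 3 = 9 rows.

```python
from itertools import product

# Hypothetical placeholders for one program's concerts and pieces.
concerts = ["concert 1", "concert 2", "concert 3"]
pieces = ["piece A", "piece B", "piece C"]

# Pair every concert with every piece (a cross join).
records = [{"concert": c, "piece": p} for c, p in product(concerts, pieces)]

print(len(records))  # 9
```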

 

Natasha
9 - Comet

Here is my solution. I agree with @terry10 that there might be multiple instances of concertInfo, though the solution seems to keep only the first one so I did the same.

 

Spoiler
image.png
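The difference between keeping every concertInfo and keeping only the first one can be illustrated in Python with the standard library's `xml.etree.ElementTree`. This is a sketch under assumptions: the `Date` child element and the sample values are hypothetical, and only the `concertInfo` name comes from the post above.

```python
import xml.etree.ElementTree as ET

# Hypothetical program with three concertInfo blocks; child names are assumed.
PROGRAM = ET.fromstring(
    "<program>"
    "<concertInfo><Date>day 1</Date></concertInfo>"
    "<concertInfo><Date>day 2</Date></concertInfo>"
    "<concertInfo><Date>day 3</Date></concertInfo>"
    "</program>"
)

# findall() returns every matching child; find() returns only the first.
all_concerts = PROGRAM.findall("concertInfo")
first_concert = PROGRAM.find("concertInfo")

all_dates = [c.findtext("Date") for c in all_concerts]
first_date = first_concert.findtext("Date")

print(all_dates)
print(first_date)
```

Using the first match only gives one row per piece, while using all matches multiplies the rows by the number of concerts.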