Challenge #33: Reshaping Nielsen Data
- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hopefully everyone had fun with the Grand Prix challenges and a few of you are ready to jump in the ring next year for the 2017 Grand Prix at Inspire in Las Vegas. The solution for the final lap (challenge #32) is HERE.
Let’s dial it back this week and look another real world example of using Alteryx to reshape data into a usable format for analysis.
Use Case: A radio station is trying to analyze data they receive from Nielsen disclosing the number of listeners the station has on a weekly basis by program. The challenge is that the data is formatted in a way that makes it challenging to use for analytics.
Objective: Reshape the data detailing the listening stats for the 30 programs listed in the data.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Ok, quick and dirty (hold your comments, please).
To do this I simply filtered the rows with the unique combination of null fields, then dynamically renamed the columns by taking the headers from the first row.
I then isolated the rows containing the pertinent data relating to the two age groups:
As you can see, I used the "In-Tab" row as a divider. I then split into two streams for the two age brackets. I then generated a row id and named this "Code".
I then joined the two age bracket streams to the code stream, renamed the columns and filtered out any null lines.
Not the most elegent solution, but I'm on a post-Christmas diet so leave me alone.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
:-) I went the longest way round possible
2 questions on the provided solution (from @TaraM / @GeneR)
- it seems like there's a first row of ratings data which this solution misses. Both @mceleavey and I have hacked around this by starting to count the ratings data from 0 instead of 1
- one field is called "P25_34Consume" instead of "P25_54Consume"
I've attached my super-awkward, super-long-way-round solution :-)
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
My (verrrry long) solution... this one was brutal for me... ouch. :)
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
More ETL fun?