Dublin, IRL

Welcome to the Dublin User Group


Weekly exercise #4

Carlos_A
8 - Asteroid

Hello all,

Hope you enjoyed last week's exercise; you can find the solution here. We will continue with lap 2 this week and do some analytics on our baseball dataset. If you did not finish last week's exercise, or just want to join us this week, use the provided solution to create the dataset.

Objective: Please use Alteryx to answer the following questions:

    1) Who has the youngest team and what is the average age?
    2) Which team has the best (lowest) average pitcher rank?
    3) What is the Pearson correlation between 2015 Team Rank and Hitter Rank?
    4) Please build a table containing the count of players on each fantasy team by position, the sum of home runs for the hitters by fantasy team, and sum of wins for the pitchers by fantasy team, ordered by 2015 team rank.
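
For anyone who wants to cross-check their workflow outside Alteryx, the four questions can be sketched in pandas roughly as below. This is only an illustration: the column names and the tiny sample table are assumptions, not the real dataset, so rename them to match whatever last week's solution produces.

```python
import pandas as pd

# Made-up sample rows; the column names are assumptions, not the real dataset's.
players = pd.DataFrame({
    "Fantasy Team": ["A", "A", "A", "B", "B", "B"],
    "2015 Team Rank": [1, 1, 1, 2, 2, 2],
    "Position": ["P", "1B", "OF", "P", "P", "OF"],
    "Age": [30, 25, 27, 24, 26, 22],
    "Pitcher Rank": [10, None, None, 40, 20, None],
    "Hitter Rank": [None, 5, 15, None, None, 50],
    "HR": [None, 30, 20, None, None, 10],
    "W": [15, None, None, 8, 12, None],
})

by_team = players.groupby("Fantasy Team")

# 1) Youngest team and its average age
avg_age = by_team["Age"].mean()
youngest_team, youngest_age = avg_age.idxmin(), avg_age.min()

# 2) Best (lowest) average pitcher rank; mean() skips nulls by default
best_pitching_team = by_team["Pitcher Rank"].mean().idxmin()

# 3) Pearson correlation; Series.corr ignores rows where either value is null
pearson = players["2015 Team Rank"].corr(players["Hitter Rank"])

# 4) Player counts by position plus HR and W totals, ordered by 2015 team rank
position_counts = pd.crosstab(players["Fantasy Team"], players["Position"])
totals = by_team.agg(**{
    "2015 Team Rank": ("2015 Team Rank", "first"),
    "Sum HR": ("HR", "sum"),
    "Sum W": ("W", "sum"),
})
summary = position_counts.join(totals).sort_values("2015 Team Rank")

print(youngest_team, youngest_age)
print(best_pitching_team)
print(round(pearson, 3))
print(summary)
```

Note that step 3 silently drops rows with a null [Hitter Rank], which is itself a choice about how to treat missing values.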

Good luck, and don't forget to post your solutions.

Carlos_A
8 - Asteroid

Hi all,

Hope you enjoyed this exercise. This time it seems that the developers didn't provide an official solution, but I have attached my solution to this post.

Ollie
7 - Meteor

Thanks again for posting, Carlos!

My attempt is the same as yours, except I felt it better to filter out the rows representing pitchers before calculating the Pearson coefficient...

Ollie
7 - Meteor

Oops, I made a mistake. It is corrected in the attachment.

Carlos_A
8 - Asteroid

Hi Ollie,

Good stuff! I have only one concern about filtering out null values for [Hitter Rank]: perhaps these values were left null because the players ranked very low and did not make the cut. A quick max on that field shows that the ranks go up to 367, so perhaps the players with null values ranked below that, which is why they were not classified.

Filtering biases the dataset and skews the results. As seen in your results, it decreased the correlation value, which could suggest an inverse linear relationship between the fields, when in fact (from the results given by the workflow) there seems to be no correlation, and therefore no linear relationship, between them.
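
To make the point concrete, here is a minimal pandas sketch comparing the two treatments of the null ranks. The numbers are made up and do not reproduce either of our results, and filling the nulls with "max rank + 1" is only my guess at what an unranked player might represent.

```python
import pandas as pd

# Made-up numbers for illustration only; not taken from the exercise dataset.
df = pd.DataFrame({
    "2015 Team Rank": [1, 1, 2, 2, 3, 3, 4, 4],
    "Hitter Rank": [12, None, 40, 150, None, 90, 300, None],
})

# Option A: filter out rows with a null Hitter Rank before correlating
dropped = df.dropna(subset=["Hitter Rank"])
r_dropped = dropped["2015 Team Rank"].corr(dropped["Hitter Rank"])

# Option B (assumption): treat nulls as ranked just below the observed maximum
placeholder = df["Hitter Rank"].max() + 1
filled = df["Hitter Rank"].fillna(placeholder)
r_filled = df["2015 Team Rank"].corr(filled)

print(f"nulls dropped: r = {r_dropped:.3f} (n = {len(dropped)})")
print(f"nulls filled:  r = {r_filled:.3f} (n = {len(df)})")
```

The two coefficients differ, so the way the nulls are handled is part of the answer and is worth stating alongside the number.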

 

What are your thoughts?