Weekly Challenges

Challenge #301: New Year's Resolutions

Adarsh_R3
8 - Asteroid

Happy New Year, all!

 

Spoiler
Adarsh_R3_1-1641303843393.png

 

davidhardister
8 - Asteroid

davidhardister_0-1641305734892.png

 

TH
8 - Asteroid

I started working through this on 3 Jan 2022, and I have come to believe that the data is not sufficient to answer the questions.

While it is true that the input spreadsheet data is in a very strange format, that is, I believe, not much of a problem. I posit that the main issue is that the data is already aggregated, and so is unable and unfit to answer the questions posed.

 

The numbers in the data indicate the probability of a particular resolution given a particular demographic category. That is, assuming we are only looking at people in a particular demographic category, what is the probability that a person chose a particular resolution? This is evident because the sum of the values across an entire resolution row is 1 (rounding means that this isn't always exactly true, but it's the idea).

The questions to be answered are -

1. What were the top 3 New Year's resolutions for 2019?
2. What percentage of fitness-related resolutions (exercising more, losing weight, eating healthier and improving health) were made by suburban men and women?
3. Which group of people were most likely to keep their resolutions in 2018?

 

1 - "Top" in what sense? Putting aside that everyone only got to choose one resolution (which is a mite strange if you ask me), the data doesn't tell us how many people are in each demographic category. Finding out what percentage of all the people chose a particular resolution is not possible if all we know is what percentage of each demographic chose it. The answer depends on how much of the population falls in each demographic category. For example, a population evenly distributed between the geographic regions will give a different answer than a highly skewed one.

2 - The question asks "What is the probability of being a suburban man/woman given that the person already made a fitness-related resolution?". The information provided is the probability of having made a fitness-related resolution given that the person is a suburban man/woman. Those two conditional probabilities are the reverse of each other, and getting from one to the other is a classic case of applying Bayes' Theorem. Unfortunately, Bayes' Theorem requires knowing the overall probability of being a suburban man or woman in this context, and we are not given that information. We might assume a particular value, but without more information about the data we can't know how close or far our assumption is from the reality of the sample.

3 - The question, restated, is - assuming we know the probability of "keep" (or "yes") given an arbitrary demographic group, which value, over all the demographic divisions, is greatest? This can be read straight off the spreadsheet because it is exactly what the spreadsheet is giving us.
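To make the point about question 2 concrete, here is a minimal sketch in Python. Every number in it is made up for illustration and none of it comes from the challenge spreadsheet; the group names and the helper `posterior_suburban` are hypothetical too. It only shows that the same conditional probabilities give different answers depending on the population mix we assume.

```python
# Hypothetical numbers only: none of these values come from the challenge data.
# P(fitness resolution | group), the kind of figure the spreadsheet provides.
p_fitness_given_group = {"suburban": 0.40, "urban": 0.30, "rural": 0.20}


def posterior_suburban(group_shares):
    """Bayes' Theorem: P(suburban | fitness) for an assumed population mix."""
    # Law of total probability: P(fitness) = sum over groups of P(fitness | g) * P(g)
    p_fitness = sum(
        p_fitness_given_group[g] * share for g, share in group_shares.items()
    )
    return p_fitness_given_group["suburban"] * group_shares["suburban"] / p_fitness


# Two assumed population mixes (the missing information in the challenge).
even_mix = {"suburban": 1 / 3, "urban": 1 / 3, "rural": 1 / 3}
skewed_mix = {"suburban": 0.10, "urban": 0.80, "rural": 0.10}

print(round(posterior_suburban(even_mix), 3))    # about 0.444
print(round(posterior_suburban(skewed_mix), 3))  # about 0.133
```

The weighted sum inside the function is also what question 1 would need, which is why both questions hinge on the group sizes.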

 

I've mostly enjoyed the Alteryx community challenges so far. This one, though, doesn't seem doable to me without more information.

I'm interested in other perspectives. What do you think?

scoles0617
8 - Asteroid

My solution:

Justin_B
8 - Asteroid

HNY!

Spoiler
justin_butler_0-1641313806205.png

 

nklassen
5 - Atom

I know it's not the prettiest but I'm really enjoying learning and stretching my Alteryx skills. 

NicoleKlassen
5 - Atom

Solution

juliabarale002
8 - Asteroid

🎆

summit_view
8 - Asteroid
Spoiler
cgrace_0-1641345475362.png

 

alexnajm
16 - Nebula

I agree with @TH that the instructions could be better, but here is my solution. It's not the easiest to interpret.

Spoiler
Challenge 301.PNG